Abstract

Regression models for limited continuous dependent variables having a 
non-negligible probability of attaining exactly their limits are presented. 
The models differ in the number of parameters and in their flexibility. 
Fractional data being a special case of limited dependent data, the models 
also apply to variables that are a fraction or a proportion. It is shown how 
to fit these models and they are applied to a Loss Given Default dataset 
from insurance to which they provide a good fit. 

Keywords: Fractional response variables; censored distributions; Tobit mod- 
els; limited dependent variables; Loss Given Default 

1 Introduction 

Proportions or fractions are of considerable interest in economics as well as
other sciences. They are usually bounded by 0 and 1 (or 100%). Often, such
quantities show a substantial probability of attaining one or both of the
boundary values. Such variables have been termed "fractional response
variables" by Papke and Wooldridge. In a recent survey paper on modeling
fractional data, Ramalho et al. (2011) list pension plan participation rates,
firm market share, proportion of debt in the financing mix of firms, fraction of
land area allocated to agriculture, and proportion of exports in total sales as
examples. Another example is illustrated in Papke and Wooldridge (2008), where
test pass rates are analyzed.

In insurance, losses are frequently restricted to be positive and below an 
upper bound defined by a contract. We analyze a Loss Given Default dataset 
from an insurance category called "surety". In this example, claims cannot 
exceed a prespecified insured maximum, i.e., the ratio of loss over maximum 
is bounded by 1. On the other hand, for several reasons, the claims often do 
not lead to ultimate losses. The interest is in relating the distribution of this 
variable to a set of explanatory variables by a regression model. 

For a fractional response variable $Y$, an important type of model focuses on
the conditional mean $E[Y|x]$ given a vector of covariates $x$. A popular choice
is to use the logistic function as a link function between a linear predictor and
$E[Y|x]$, but other cumulative distribution functions can also be used. Another
semiparametric approach relies on assumptions about quantiles (see, e.g., Powell
(1984), Khan and Powell (2001) or Chen and Khan (2001)). Whereas these
approaches are sufficient for the purpose of many studies, in other cases, other
aspects of the distribution of $Y$ given $x$, like upper quantiles or probabilities of
attaining the limits, are of interest, as is the case in our application. In that case,
parametric models are advantageous. On the other hand, since semiparametric
models rely on fewer assumptions, they have the advantage that they are less
prone to misspecification.

When there is a non-zero probability that the boundary values are attained,
it is natural to use models based on censored random variables. These models
are used in different fields of application. In economics, analyzing household
expenditure on durable goods, Tobin (1958) first introduced such a model,
which was later coined the Tobit model by Goldberger (1964). In climate
science, precipitation can be modelled using censored distributions (see, e.g.,
Bardossy and Plate (1992) or Sanso and Guenni (2004)).



The Tobit model describes the distribution of $Y$ given $x$ as a censored
normal with expectation $\mu = x'\beta$. It is therefore often perceived as a model for
censored data, which it is in the detection limit case. However, it is perfectly
adequate to use the censored normal distribution as a probability model in
situations where no actual censoring occurs and the zeros are genuine values of
the response, as is the case for the original application of Tobin (1958). The
use of censored distributions is then a device to obtain a tractable model even
though the data are not actually censored.

To support our thinking about the situation to be modeled, it is often helpful
to attach the idea of a "potential" to a latent, uncensored response variable
$Y^*$, of which $Y$ is the censored version. In the case of precipitation, this
potential measures a tendency for rain which may move from zero to far below
zero, indicating that the weather develops from cloudy to very dry. For the
standardized losses in our insurance example, the latent variables can be thought of
as a loss potential. We note that Wooldridge (2002) calls models for variables
that have a discrete and a continuous part, without actual censoring occurring,
corner solution models. In Wooldridge (2002, Chapter 16), it is stated that
an additional advantage of using a parametric distribution for modeling corner
solution outcomes is that estimates of quantities such as $E[Y|x]$ are efficient.

The Tobit model is easily adjusted to the case of an additional upper limit
for $Y$ (Rosett and Nelson) and thus to fractional data, and the
generalization to replacing the normal distribution by any other suitable family is
conceptually straightforward. In fact, when using censored distributions, one
can model all quantities of interest, such as the mean, quantiles, and
probabilities of attaining limits, together. We will focus on this approach in the following,
using a shifted gamma distribution instead of the normal. The gamma
distribution is a flexible distribution that is popular in insurance, and it will be shown
to fit our data well (see, e.g., Figure 2).

Models based on a single random variable, such as the Tobit model or the 
censored gamma model, have the advantage of having a parsimonious param- 
eterization entailing easier and more consistent interpretation. However, there 
are situations in which the frequencies of the limits do not follow this
parsimonious description. We therefore also introduce two extensions of the model.
For instance, in our example there may even be administrative reasons for an 
excessive number of zero losses, due to incentives to place a claim with little 
justification. Such preventive filing may result in a large number of "additional 
zeros". This idea suggests a mixture model, consisting of a censored part, as 
introduced above, and a model for the additional zeros. 

Another approach to tackle this problem is called two-part models by
Ramalho et al. (2011). These are extensions of the models of Papke and Wooldridge
(1996). Here, a first model describes the occurrence of boundary values. Then,
the continuous part can be modeled, for instance, by using the beta distribution
(see Paolino (2001) and Ferrari and Cribari-Neto (2004)). Ramalho and da Silva
(2009) and Cook et al. (2008) present empirical applications of two-part
fractional response models. We also introduce an alternative extension of the
censored gamma model based on this two-part modeling idea. Here, the
probabilities of attaining the boundary value(s) are modelled separately from the
continuous part in between them.

The rest of the paper is organized as follows. In Section 2, we introduce the
censored gamma model, show how it can be interpreted, and derive an
estimation procedure for it. In Section 3, two possible generalizations are presented.
In Section 4, we illustrate an application of the models to the dataset mentioned
above.



2 The Censored Gamma Model 



In order to establish ideas, consider the Tobit model in its two-sided version as
developed by Rosett and Nelson. It is assumed that there exists a latent
variable $Y^*$ which is, conditional on some covariates $x = (x_1, \dots, x_p)' \in \mathbb{R}^p$,
normally distributed. This variable is observed only if it lies in the interval
$[0, 1]$. Otherwise, we observe 0 or 1, depending on whether the latent variable
is smaller than 0 or greater than 1, respectively. If $Y$ denotes the observed
variable, this can be expressed as

$$Y^* | x \sim \mathcal{N}(\mu, \sigma^2) \tag{1}$$

and

$$
Y = \begin{cases}
0, & \text{if } Y^* \le 0, \\
Y^*, & \text{if } 0 < Y^* < 1, \\
1, & \text{if } Y^* \ge 1.
\end{cases} \tag{2}
$$

Furthermore, the expectation $\mu$ of the latent variable $Y^*$ is related to the
covariates $x$ through

$$\mu = x'\beta, \quad \beta \in \mathbb{R}^p.$$

For more details, e.g., on inference, we refer to Maddala (1983), Chapter 6,
and Amemiya (1985), Chapter 10. Furthermore, Breen (1996) and Long (1997)
give overviews of models for limited dependent variables.
Clearly, the assumption of a normal distribution for $Y^*$ is not adequate for
all data. It is well known that the Tobit model is sensitive to distributional
assumptions (see, e.g., Arabmazar and Schmidt (1982) or Maddala and Nelson).
A natural alternative is to replace the normal distribution by another
one. We choose a shifted gamma distribution since it is a flexible distribution
that is applied in many areas, especially in insurance. Further, it provides
a good fit to the dataset of insurance claims mentioned above. This choice
relies on distributional assumptions which have to be checked when applying
the model to data.

To avoid unnecessary inflation of notation, we let the boundaries of the
observed variable be 0 and 1. The model is easily generalized to variables whose
range of values is any interval $[y_l, y_u]$ with $y_l < y_u$, though. This might be done
either by first applying a linear transformation to the respective variable or by
reformulating the model. The case where the observations are only bounded
from below is included by letting $y_u \to \infty$.



2.1 The Model 

Generalizing the Tobit model specified in (1) and (2), it is assumed that there
exists a latent variable $Y^*$ which has, conditional on $x$, a distribution with
density $f^*_\theta(y^*)$ and cumulative distribution function $F^*_\theta(y^*)$, $\theta$ being a vector
of parameters. The observed dependent variable $Y$ then depends on the latent
variable as in (2).

It follows that the distribution of such a censored variable $Y$ can be
characterized by

$$
\begin{aligned}
P[Y = 0] &= F^*_\theta(0), \\
P[Y \in (y, y + dy)] &= f^*_\theta(y)\,dy, \quad 0 < y < 1, \\
P[Y = 1] &= 1 - F^*_\theta(1).
\end{aligned} \tag{3}
$$

Consequently, the density of the observed variable $Y$ can be written as

$$f_\theta(y) = F^*_\theta(0)\,\delta_0(y) + f^*_\theta(y)\,1_{\{0<y<1\}}(y) + \left(1 - F^*_\theta(1)\right)\delta_1(y), \quad 0 \le y \le 1, \tag{4}$$

where $\delta_0(y)$ and $\delta_1(y)$ are Dirac measures and where $1_{\{0<y<1\}}(y)$ denotes the
indicator function equaling 1 if $0 < y < 1$ and 0 otherwise.

In order to extend the model to the regression case, we relate the distribution
of $Y^*$ to the covariates $x$. This is done by assuming that the main parameter
$\vartheta$ of the distribution of $Y^*$, which might be the mean or a scale parameter, is
related through a link function $g$ to the covariates,

$$g(\vartheta) = x'\beta. \tag{5}$$

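In the following, we will focus on the case where the distribution of $Y^*$ is
specified as a gamma distribution with a shifted origin. The density and the
distribution function of a gamma distributed variable with shape parameter $\alpha$
and scale parameter $\vartheta$ are denoted by $g_{\alpha,\vartheta}(y)$ and $G_{\alpha,\vartheta}(y)$, respectively. The
density of a shifted gamma distribution is then

$$g_{\alpha,\vartheta}(y^* + \xi) = \frac{1}{\vartheta^\alpha \Gamma(\alpha)} (y^* + \xi)^{\alpha - 1} e^{-(y^* + \xi)/\vartheta}, \quad y^* > -\xi,$$

where $\alpha, \vartheta, \xi > 0$, and its distribution function is $G_{\alpha,\vartheta}(y^* + \xi)$.
The density of the observed $Y$ can be expressed as

$$
\begin{aligned}
f_{\alpha,\vartheta,\xi}(y) ={}& G_{\alpha,\vartheta}(\xi)\,\delta_0(y) + g_{\alpha,\vartheta}(y + \xi)\,1_{\{0<y<1\}}(y) \\
&+ \left(1 - G_{\alpha,\vartheta}(1 + \xi)\right)\delta_1(y), \quad 0 \le y \le 1.
\end{aligned} \tag{6}
$$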



The use of a gamma distribution with a shifted origin, instead of a standard
gamma distribution, is motivated by the fact that the lower censoring occurs
at zero. In this case, the shift $\xi$ is needed to obtain a positive probability of
$Y = 0$.

For the regression case, we assume that the scale parameter $\vartheta$ is related to
the covariates via the logarithmic link function

$$\log(\vartheta) = x'\beta. \tag{7}$$

Henceforth and if not otherwise stated, we assume that $Y^*$ (and $Y$) follow
a (censored) shifted gamma distribution. We will refer to this model as the
"censored gamma model".

Note that if no censoring occurred and $\xi$ was set to zero, the censored gamma
model would be a generalized linear model (McCullagh and Nelder (1983)) for
a gamma distributed variable with a logarithmic link function.
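
As a minimal numerical sketch of the censored gamma distribution in (6), the following Python code evaluates the point masses at 0 and 1 and the continuous density in between. The function names and parameter values are illustrative assumptions and not part of the paper. To keep the example self-contained, the gamma distribution function is computed with a simple power series; in practice a statistical library would normally be used.

```python
import math

def gamma_cdf(y, shape, scale):
    """Gamma distribution function G_{shape,scale}(y), computed from the
    power series of the regularized lower incomplete gamma function."""
    x = y / scale
    if x <= 0.0:
        return 0.0
    term = 1.0 / shape
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total):
        term *= x / (shape + n)
        total += term
        n += 1
    return total * math.exp(shape * math.log(x) - x - math.lgamma(shape))

def gamma_pdf(y, shape, scale):
    """Gamma density g_{shape,scale}(y)."""
    if y <= 0.0:
        return 0.0
    return math.exp((shape - 1.0) * math.log(y) - y / scale
                    - math.lgamma(shape) - shape * math.log(scale))

def censored_gamma(shape, scale, shift):
    """Return (P[Y=0], P[Y=1], density on (0,1)) for the censored gamma model."""
    p_zero = gamma_cdf(shift, shape, scale)
    p_one = 1.0 - gamma_cdf(1.0 + shift, shape, scale)

    def density(y):
        return gamma_pdf(y + shift, shape, scale) if 0.0 < y < 1.0 else 0.0

    return p_zero, p_one, density

# Hypothetical parameter values, chosen only for illustration.
p_zero, p_one, density = censored_gamma(shape=2.0, scale=0.4, shift=0.3)

# Midpoint rule for the probability mass of the continuous part on (0, 1).
n_grid = 2000
p_middle = sum(density((k + 0.5) / n_grid) for k in range(n_grid)) / n_grid

print(p_zero, p_middle, p_one, p_zero + p_middle + p_one)
```

The three pieces should add up to one up to the error of the numerical integration, which is a quick sanity check that the shift produces a proper distribution with positive mass at both limits.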
2.2 Interpretation 

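If the focus lies on the latent response variable $Y^*$, the interpretation is
straightforward. Since

$$E[Y^*|x] = \alpha\vartheta - \xi, \tag{8}$$

the marginal effect of a continuous predictor $x_j$ on $E[Y^*|x]$ is

$$\frac{\partial E[Y^*|x]}{\partial x_j} = \beta_j \alpha \vartheta. \tag{9}$$

On the other hand, one might be primarily interested in the observed
variable $Y$, rather than the latent variable $Y^*$. Its mean and corresponding marginal
effects are calculated in the following lemma.

Lemma 2.1 The following holds true.

$$
\begin{aligned}
E[Y|x] ={}& \alpha\vartheta \left( G_{\alpha+1,\vartheta}(1 + \xi) - G_{\alpha+1,\vartheta}(\xi) \right) \\
&+ (1 + \xi)\left(1 - G_{\alpha,\vartheta}(1 + \xi)\right) - \xi\left(1 - G_{\alpha,\vartheta}(\xi)\right),
\end{aligned} \tag{10}
$$

and for a continuous covariate $x_j$,

$$\frac{\partial E[Y|x]}{\partial x_j} = \beta_j \alpha \vartheta \left( G_{\alpha+1,\vartheta}(1 + \xi) - G_{\alpha+1,\vartheta}(\xi) \right). \tag{11}$$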
The derivation of these two equations is shown in the appendix.

We note that the marginal effect of $x_j$ on $E[Y|x]$ is a scaled version of
the effect on $E[Y^*|x]$, with the scaling factor depending nonlinearly on the
covariates.
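
The mean and marginal effect of Lemma 2.1 can be checked numerically. The sketch below, with hypothetical coefficient values and function names that are assumptions for illustration only, implements $E[Y|x] = \alpha\vartheta(G_{\alpha+1,\vartheta}(1+\xi) - G_{\alpha+1,\vartheta}(\xi)) + (1+\xi)(1 - G_{\alpha,\vartheta}(1+\xi)) - \xi(1 - G_{\alpha,\vartheta}(\xi))$ and compares the analytic marginal effect with a central finite difference.

```python
import math

def gamma_cdf(y, shape, scale):
    """Gamma distribution function via the incomplete gamma power series."""
    x = y / scale
    if x <= 0.0:
        return 0.0
    term = 1.0 / shape
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total):
        term *= x / (shape + n)
        total += term
        n += 1
    return total * math.exp(shape * math.log(x) - x - math.lgamma(shape))

def mean_observed(alpha, theta, xi):
    """E[Y|x] of the censored gamma model for scale theta = exp(x'beta)."""
    inner = gamma_cdf(1.0 + xi, alpha + 1.0, theta) - gamma_cdf(xi, alpha + 1.0, theta)
    return (alpha * theta * inner
            + (1.0 + xi) * (1.0 - gamma_cdf(1.0 + xi, alpha, theta))
            - xi * (1.0 - gamma_cdf(xi, alpha, theta)))

def marginal_effect(beta_j, alpha, theta, xi):
    """Analytic dE[Y|x]/dx_j for a continuous covariate x_j."""
    inner = gamma_cdf(1.0 + xi, alpha + 1.0, theta) - gamma_cdf(xi, alpha + 1.0, theta)
    return beta_j * alpha * theta * inner

# Hypothetical values: one covariate with intercept beta0 and slope beta1.
alpha, xi = 2.0, 0.3
beta0, beta1 = -1.0, 0.5

def theta_at(x):
    return math.exp(beta0 + beta1 * x)

x0, h = 0.8, 1e-5
analytic = marginal_effect(beta1, alpha, theta_at(x0), xi)
numeric = (mean_observed(alpha, theta_at(x0 + h), xi)
           - mean_observed(alpha, theta_at(x0 - h), xi)) / (2.0 * h)
print(analytic, numeric)
```

The scaling factor $G_{\alpha+1,\vartheta}(1+\xi) - G_{\alpha+1,\vartheta}(\xi)$ lies between 0 and 1, so the effect on the observed mean is always smaller in absolute value than the effect on the latent mean.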

If the interest lies in, say, the probability of $Y$ being zero, $P[Y = 0] =
G_{\alpha,\vartheta}(\xi)$, one can also calculate partial effects on this quantity. For a continuous
$x_j$, using similar ideas as in the proof of the above lemma, it is easily shown
that

$$\frac{\partial P[Y = 0|x]}{\partial x_j} = \frac{\partial G_{\alpha,\vartheta}(\xi)}{\partial x_j} = -\beta_j\, \xi\, g_{\alpha,\vartheta}(\xi). \tag{12}$$

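Finally, one can also consider quantiles. The quantile function $F^{-1}_{\vartheta,\xi}(q)$, for
$q \in [0, 1]$, of $Y$ is given by

$$
F^{-1}_{\vartheta,\xi}(q) = \begin{cases}
0, & \text{if } 0 \le q \le G_{\alpha,\vartheta}(\xi), \\
G^{-1}_{\alpha,\vartheta}(q) - \xi, & \text{if } G_{\alpha,\vartheta}(\xi) < q < G_{\alpha,\vartheta}(1 + \xi), \\
1, & \text{if } G_{\alpha,\vartheta}(1 + \xi) \le q \le 1.
\end{cases} \tag{13}
$$

Since $G^{-1}_{\alpha,\vartheta}(q) = \vartheta\, G^{-1}_{\alpha,1}(q)$, the partial effect of a continuous covariate $x_j$
on the $q$-quantile $F^{-1}_{\vartheta,\xi}(q)$ is therefore

$$
\frac{\partial F^{-1}_{\vartheta,\xi}(q)}{\partial x_j} = \begin{cases}
0, & \text{if } 0 \le q < G_{\alpha,\vartheta}(\xi), \\
\beta_j \vartheta\, G^{-1}_{\alpha,1}(q), & \text{if } G_{\alpha,\vartheta}(\xi) < q < G_{\alpha,\vartheta}(1 + \xi), \\
0, & \text{if } G_{\alpha,\vartheta}(1 + \xi) < q \le 1.
\end{cases} \tag{14}
$$

Note that for the cases $q = G_{\alpha,\vartheta}(\xi)$ and $q = G_{\alpha,\vartheta}(1 + \xi)$, the function
$F^{-1}_{\vartheta,\xi}(q)$ is not differentiable with respect to $x_j$ and, consequently, partial effects
cannot be calculated.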
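
The piecewise quantile function can be sketched in a few lines of Python. This is an illustration under stated assumptions: the helper names are hypothetical, the gamma quantile is obtained by plain bisection rather than a library routine, and the parameter values are arbitrary.

```python
import math

def gamma_cdf(y, shape, scale):
    """Gamma distribution function via the incomplete gamma power series."""
    x = y / scale
    if x <= 0.0:
        return 0.0
    term = 1.0 / shape
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total):
        term *= x / (shape + n)
        total += term
        n += 1
    return total * math.exp(shape * math.log(x) - x - math.lgamma(shape))

def gamma_quantile(q, shape, scale):
    """Invert the gamma distribution function by bisection, for 0 <= q < 1."""
    if not 0.0 <= q < 1.0:
        raise ValueError("q must lie in [0, 1)")
    lo, hi = 0.0, scale
    while gamma_cdf(hi, shape, scale) < q:   # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, shape, scale) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def censored_quantile(q, alpha, theta, xi):
    """Quantile of the observed Y: 0 below G(xi), 1 above G(1 + xi)."""
    if q <= gamma_cdf(xi, alpha, theta):
        return 0.0
    if q >= gamma_cdf(1.0 + xi, alpha, theta):
        return 1.0
    return gamma_quantile(q, alpha, theta) - xi

# Hypothetical parameters; with shape 1 the gamma is an exponential distribution.
for q in (0.2, 0.6, 0.9):
    print(q, censored_quantile(q, alpha=1.0, theta=1.0, xi=0.5))
```

With these values, quantile levels below $G_{\alpha,\vartheta}(\xi) \approx 0.39$ map to 0 and levels above $G_{\alpha,\vartheta}(1+\xi) \approx 0.78$ map to 1, which illustrates why covariates have no local effect on quantiles in the censored regions.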

2.3 Estimation 

In this section, it is shown how to perform maximum likelihood estimation for
the censored gamma model using a Newton-Raphson method known as Fisher's
scoring algorithm (see, e.g., Fahrmeir and Tutz (2001)).

Denoting generically by $\theta$ all parameters that are to be estimated and by
$\ell(\theta)$ the log-likelihood, Fisher's scoring algorithm starts with an initial estimate
$\theta^{(0)}$ and iteratively calculates (until convergence is achieved)

$$\theta^{(k+1)} = \theta^{(k)} + I\left(\theta^{(k)}\right)^{-1} s\left(\theta^{(k)}\right), \quad k = 0, 1, 2, \dots,$$

where

$$s(\theta) = \frac{\partial \ell(\theta)}{\partial \theta}$$

denotes the score function, i.e., the first derivative of the log-likelihood, and

$$I(\theta) = E_\theta\left[ s(\theta)\, s(\theta)^T \right]$$

is the Fisher Information Matrix. How these two quantities are calculated for
the censored gamma model is shown in the following.
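
To make the iteration concrete, here is a deliberately simplified sketch of Fisher scoring. It is not the censored gamma model: it assumes an uncensored gamma sample with known shape, no shift, and an intercept-only log link $\log(\vartheta) = \beta$, so that the score is $\sum_i y_i/\vartheta - n\alpha$ and the Fisher information is $n\alpha$. The data and names are hypothetical.

```python
import math

def fisher_scoring_log_scale(y, shape, beta=0.0, tol=1e-12, max_iter=100):
    """Fisher scoring for beta = log(scale) of a gamma sample with known shape."""
    n = len(y)
    total = sum(y)
    for _ in range(max_iter):
        theta = math.exp(beta)
        score = total / theta - n * shape   # first derivative of the log-likelihood
        info = n * shape                    # expected squared score (Fisher information)
        step = score / info
        beta += step                        # theta^(k+1) = theta^(k) + I^{-1} s
        if abs(step) < tol:
            break
    return beta

# Hypothetical data, for illustration only.
y = [0.8, 1.9, 2.4, 3.1, 0.6, 1.7]
shape = 2.0
beta_hat = fisher_scoring_log_scale(y, shape)

# In this simple case the maximum likelihood estimate is known in closed form.
print(beta_hat, math.log(sum(y) / (len(y) * shape)))
```

For the censored gamma model the same update is used, but the score and the information matrix involve the censored contributions derived below and the parameter is a vector.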

First, we reparametrize the shape parameter $\alpha$ through

$$\alpha' = \log(\alpha) \tag{15}$$

to ensure that $\alpha$ attains only positive values. The parameters that are to be
estimated, therefore, consist of $\theta = (\alpha', \beta, \xi)$.

Assuming that we have independent data $y_1, \dots, y_n$ with covariates $x_1, \dots, x_n$,
the log-likelihood function can be written as

$$\ell(\theta) = \sum_{i=1}^n \ell_i(\theta).$$



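Lemma 2.2 The following relations hold true.

$$
\begin{aligned}
\frac{\partial \ell_i(\theta)}{\partial \alpha'} ={}& \frac{\alpha}{G_{\alpha,\vartheta_i}(\xi)} \left( -\psi(\alpha)\, G_{\alpha,\vartheta_i}(\xi) + H^{(1)}_\alpha\!\left(0, \tfrac{\xi}{\vartheta_i}\right) \right) 1_{\{y_i = 0\}} \\
&+ \alpha \left( -\log(\vartheta_i) - \psi(\alpha) + \log(y_i + \xi) \right) 1_{\{0 < y_i < 1\}} \\
&+ \frac{\alpha}{1 - G_{\alpha,\vartheta_i}(1 + \xi)} \left( \psi(\alpha)\, G_{\alpha,\vartheta_i}(1 + \xi) - H^{(1)}_\alpha\!\left(0, \tfrac{1 + \xi}{\vartheta_i}\right) \right) 1_{\{y_i = 1\}},
\end{aligned} \tag{16}
$$

$$
\begin{aligned}
\frac{\partial \ell_i(\theta)}{\partial \beta_k} ={}& -x_{ik}\, \xi\, \frac{g_{\alpha,\vartheta_i}(\xi)}{G_{\alpha,\vartheta_i}(\xi)}\, 1_{\{y_i = 0\}} + x_{ik} \left( -\alpha + \frac{y_i + \xi}{\vartheta_i} \right) 1_{\{0 < y_i < 1\}} \\
&+ x_{ik}\, (1 + \xi)\, \frac{g_{\alpha,\vartheta_i}(1 + \xi)}{1 - G_{\alpha,\vartheta_i}(1 + \xi)}\, 1_{\{y_i = 1\}},
\end{aligned} \tag{17}
$$

$$
\frac{\partial \ell_i(\theta)}{\partial \xi} = \frac{g_{\alpha,\vartheta_i}(\xi)}{G_{\alpha,\vartheta_i}(\xi)}\, 1_{\{y_i = 0\}} + \left( \frac{\alpha - 1}{y_i + \xi} - \frac{1}{\vartheta_i} \right) 1_{\{0 < y_i < 1\}} - \frac{g_{\alpha,\vartheta_i}(1 + \xi)}{1 - G_{\alpha,\vartheta_i}(1 + \xi)}\, 1_{\{y_i = 1\}}, \tag{18}
$$

where

$$\psi(\alpha) = \frac{d \log(\Gamma(\alpha))}{d\alpha}$$

denotes the digamma function (see Abramowitz and Stegun (1964)) and the
functions $H^{(1)}_\alpha$ and $H^{(2)}_\alpha$ are defined as

$$H^{(1)}_\alpha(l, u) = \frac{1}{\Gamma(\alpha)} \int_l^u \log(y)\, y^{\alpha - 1} \exp(-y)\, dy \tag{19}$$

and

$$H^{(2)}_\alpha(l, u) = \frac{1}{\Gamma(\alpha)} \int_l^u \log(y)^2\, y^{\alpha - 1} \exp(-y)\, dy. \tag{20}$$

We note that the functions $H^{(1)}_\alpha(l, u)$ and $H^{(2)}_\alpha(l, u)$ can be calculated using numerical
integration. In our application, we did this by adaptive quadrature using the QUADPACK
routines 'dqags' and 'dqagi' (Piessens et al. (1983)) available from Netlib.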
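The derivation of the score functions is shown in the following.

First, we infer from (6) that the likelihood function of an interval censored
gamma distribution can be written as

$$L_y(\alpha, \vartheta, \xi) = G_{\alpha,\vartheta}(\xi)\, 1_{\{y = 0\}} + g_{\alpha,\vartheta}(y + \xi)\, 1_{\{0 < y < 1\}} + \left(1 - G_{\alpha,\vartheta}(1 + \xi)\right) 1_{\{y = 1\}}, \tag{21}$$

which is equivalent to writing

$$L_y(\alpha, \vartheta, \xi) = G_{\alpha,\vartheta}(\xi)^{1_{\{y = 0\}}} \cdot g_{\alpha,\vartheta}(y + \xi)^{1_{\{0 < y < 1\}}} \cdot \left(1 - G_{\alpha,\vartheta}(1 + \xi)\right)^{1_{\{y = 1\}}}. \tag{22}$$

It follows that we can write the log-likelihood function $\ell_i(\theta)$ of an observation
$y_i$ as

$$
\begin{aligned}
\ell_i(\theta) ={}& \log\left(G_{\alpha,\vartheta_i}(\xi)\right) 1_{\{y_i = 0\}} + \log\left(g_{\alpha,\vartheta_i}(y_i + \xi)\right) 1_{\{0 < y_i < 1\}} \\
&+ \log\left(1 - G_{\alpha,\vartheta_i}(1 + \xi)\right) 1_{\{y_i = 1\}} \\
={}& \log\left(G_{\alpha,\vartheta_i}(\xi)\right) 1_{\{y_i = 0\}} \\
&+ \left( -\alpha \log(\vartheta_i) - \log(\Gamma(\alpha)) + (\alpha - 1) \log(y_i + \xi) - \frac{y_i + \xi}{\vartheta_i} \right) 1_{\{0 < y_i < 1\}} \\
&+ \log\left(1 - G_{\alpha,\vartheta_i}(1 + \xi)\right) 1_{\{y_i = 1\}},
\end{aligned}
$$

where

$$\vartheta_i = \exp(x_i'\beta) \quad \text{and} \quad \alpha = \exp(\alpha').$$

The derivative of $\ell_i$ with respect to the parameter $\alpha'$ in (16) is then
calculated using the following identity.

$$
\begin{aligned}
\frac{\partial G_{\alpha,\vartheta}(\xi)}{\partial \alpha} &= \frac{\partial}{\partial \alpha} \left( \frac{1}{\Gamma(\alpha)} \int_0^{\xi/\vartheta} y^{\alpha - 1} \exp(-y)\, dy \right) \\
&= -\frac{\Gamma'(\alpha)}{\Gamma(\alpha)^2} \int_0^{\xi/\vartheta} y^{\alpha - 1} \exp(-y)\, dy + \frac{1}{\Gamma(\alpha)} \int_0^{\xi/\vartheta} \log(y)\, y^{\alpha - 1} \exp(-y)\, dy \\
&= -\psi(\alpha)\, G_{\alpha,\vartheta}(\xi) + H^{(1)}_\alpha\!\left(0, \tfrac{\xi}{\vartheta}\right).
\end{aligned} \tag{23}
$$

Next, using

$$\frac{\partial \ell_i(\theta)}{\partial \beta_k} = \frac{\partial \ell_i(\theta)}{\partial \vartheta_i} \frac{\partial \vartheta_i}{\partial \beta_k} = \frac{\partial \ell_i(\theta)}{\partial \vartheta_i}\, \vartheta_i\, x_{ik}$$

and (42), differentiating with respect to $\beta_k$ gives the result in (17). The
calculation of the derivative with respect to $\xi$ in (18) is straightforward.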
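
As a hedged sketch of how the per-observation log-likelihood and one of its score functions could be coded, the following Python example implements $\ell_i$ and the derivative with respect to the shift $\xi$, and checks the latter against a central finite difference. Function names and parameter values are assumptions for illustration; the gamma distribution function uses a simple series instead of a library routine.

```python
import math

def gamma_cdf(y, shape, scale):
    """Gamma distribution function via the incomplete gamma power series."""
    x = y / scale
    if x <= 0.0:
        return 0.0
    term = 1.0 / shape
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total):
        term *= x / (shape + n)
        total += term
        n += 1
    return total * math.exp(shape * math.log(x) - x - math.lgamma(shape))

def gamma_pdf(y, shape, scale):
    """Gamma density, for y > 0."""
    return math.exp((shape - 1.0) * math.log(y) - y / scale
                    - math.lgamma(shape) - shape * math.log(scale))

def loglik_obs(y, alpha, theta, xi):
    """Log-likelihood contribution of one observation y in [0, 1]."""
    if y <= 0.0:
        return math.log(gamma_cdf(xi, alpha, theta))
    if y >= 1.0:
        return math.log(1.0 - gamma_cdf(1.0 + xi, alpha, theta))
    return math.log(gamma_pdf(y + xi, alpha, theta))

def score_xi(y, alpha, theta, xi):
    """Derivative of the log-likelihood contribution with respect to xi."""
    if y <= 0.0:
        return gamma_pdf(xi, alpha, theta) / gamma_cdf(xi, alpha, theta)
    if y >= 1.0:
        upper = 1.0 + xi
        return -gamma_pdf(upper, alpha, theta) / (1.0 - gamma_cdf(upper, alpha, theta))
    return (alpha - 1.0) / (y + xi) - 1.0 / theta

# Hypothetical parameter values; compare analytic and numerical derivatives.
alpha, theta, xi, h = 2.0, 0.4, 0.3, 1e-6
for y in (0.0, 0.35, 1.0):
    numeric = (loglik_obs(y, alpha, theta, xi + h)
               - loglik_obs(y, alpha, theta, xi - h)) / (2.0 * h)
    print(y, score_xi(y, alpha, theta, xi), numeric)
```

The same finite-difference comparison is a convenient way to validate hand-derived scores for the other parameters before using them in the scoring algorithm.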

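For the Fisher scoring algorithm and for asymptotic inference, we calculate
the Fisher Information Matrix

$$I(\theta)_{k,l} = E_\theta\left[ \frac{\partial \ell(\theta)}{\partial \theta_k} \frac{\partial \ell(\theta)}{\partial \theta_l} \right], \quad 1 \le k, l \le 2 + p.$$

Because of the independence of the observations, this can be written as

$$I(\theta)_{k,l} = E_\theta\left[ \sum_{i=1}^n \frac{\partial \ell_i(\theta)}{\partial \theta_k} \sum_{i=1}^n \frac{\partial \ell_i(\theta)}{\partial \theta_l} \right] = \sum_{i=1}^n E_\theta\left[ \frac{\partial \ell_i(\theta)}{\partial \theta_k} \frac{\partial \ell_i(\theta)}{\partial \theta_l} \right].$$

The specific calculations of the entries $E_\theta\left[ \frac{\partial \ell_i(\theta)}{\partial \theta_k} \frac{\partial \ell_i(\theta)}{\partial \theta_l} \right]$ are shown
in the supplementary material.

As mentioned before, the Fisher Information Matrix $I(\theta)$ is used in the
Fisher scoring algorithm for fitting the model and for asymptotic inference, in
particular to estimate standard errors of the coefficients $\beta$.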



3 Two Extensions of the Model 



A salient feature of the model defined in (3) and of the Tobit model is the
assumption that the same parameters govern both the behaviour of the
uncensored values and the probabilities of being censored from below or
above.

In order to relax this assumption, various extensions have been proposed.
Sample selection models, first introduced by Heckman (1976), are one approach.
Cragg (1971) came forward with another proposal relaxing the aforementioned
assumption of one set of parameters governing the entire model.

For count data, similar problems can arise: there may be more zeros than
expected by a simple model, which would otherwise fit well. Basically, two
different kinds of solutions have been put forward there.

Aitchison (1955) first proposed to model the zeros and the values bigger
than zero separately. Mullahy (1986) used a mixture consisting of a
distribution for the whole range of data, including zeros, and a point mass at zero to
capture extra zeros. These two types of models have been extensively applied
in various areas of research including manufacturing defects (Lambert (1992)),
patent applications (Crepon and Duguet (1997)), road safety (Miaou (1994)),
species abundance (Welsh et al. (1996)), medical consultations (Gurmu (1997)),
use of recreational facilities (Gurmu and Trivedi (1996); Shonkwiler and Shaw
(1996)), and sexual behaviour (Heilbron (1994)). Ridout et al. (1998) give an
overview of these models.

Our two extensions are based on similar ideas. The main difference is the 
way in which the zeros are modeled. In the first extension, the zeros and the 
non-zero values are modeled separately assuming that the mechanisms that 
govern the probability of Y being zero and the non-zero part are different. In 
the second extension, the zeros are modelled as a mixture of two mechanisms.
One is responsible for artificial or extra zeros whereas the other part is the
censored gamma model introduced in Section 2.

3.1 The Two-tiered Gamma Model

Inspired by the approach of Cragg (1971), we extend the model in (3) by
allowing for two different sets of parameters, one governing the probability of $Y$
being zero, and the other the behaviour for $0 < Y \le 1$.

Alternatively, the model could also be extended by allowing for a different 
set of parameters governing the probability of Y being one. The extension 
presented here, which we will call two-tiered gamma model, is mainly motivated 
by the presumption that zeros are generated by another mechanism than the 
one that governs the rest of the data. We remark that the extension to a 
"three-tiered" model including a different set of parameters for governing the 
probability of Y being one is straightforward. 

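More specifically, in the two-tiered gamma model, it is assumed that there
exist two latent variables

$$Y_1^* \sim G_{\alpha,\tilde\vartheta}(y_1^* + \xi), \quad \text{with } \tilde\vartheta = \exp(x'\gamma), \quad \gamma \in \mathbb{R}^p,$$

and

$$Y_2^* \sim G_{\alpha,\vartheta}(y_2^* + \xi) \text{ truncated at } 0, \quad \text{with } \vartheta = \exp(x'\beta), \quad \beta \in \mathbb{R}^p.$$

The first latent variable $Y_1^*$ again follows a shifted gamma distribution,
whereas the second variable $Y_2^*$ has a shifted gamma distribution that is lower
truncated at zero. These two latent variables are then related to $Y$ through

$$
Y = \begin{cases}
0, & \text{if } Y_1^* \le 0, \\
Y_2^*, & \text{if } 0 < Y_1^* \text{ and } Y_2^* < 1, \\
1, & \text{if } 0 < Y_1^* \text{ and } 1 \le Y_2^*.
\end{cases}
$$

In other words, the two-tiered gamma model first decides whether $Y$ is zero or
not. This is modeled in the style of a probit model, using, however, a cumulative
gamma distribution function instead of a normal one. It is then assumed that,
conditional on $Y > 0$, $Y$ has a lower truncated and upper censored
gamma distribution.

The distribution of $Y$ can then be characterized as follows.

$$
\begin{aligned}
P[Y = 0] &= G_{\alpha,\tilde\vartheta}(\xi), \\
P[Y \in (y, y + dy)] &= g_{\alpha,\vartheta}(y + \xi)\, \frac{1 - G_{\alpha,\tilde\vartheta}(\xi)}{1 - G_{\alpha,\vartheta}(\xi)}\, dy, \quad 0 < y < 1, \\
P[Y = 1] &= \left(1 - G_{\alpha,\vartheta}(1 + \xi)\right) \frac{1 - G_{\alpha,\tilde\vartheta}(\xi)}{1 - G_{\alpha,\vartheta}(\xi)},
\end{aligned} \tag{24}
$$

with

$$\vartheta = \exp(x'\beta), \quad \tilde\vartheta = \exp(x'\gamma), \quad \beta, \gamma \in \mathbb{R}^p, \quad \alpha, \xi > 0.$$

Again, $g_{\alpha,\vartheta}(y)$ denotes the density of a Gamma($\alpha$, $\vartheta$) distributed variable and
$G_{\alpha,\vartheta}(y)$ is the corresponding distribution function.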
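
The following sketch evaluates the three probabilities of the two-tiered gamma model: the mass at zero governed by the scale $\tilde\vartheta = \exp(x'\gamma)$, and the continuous part and the mass at one governed by $\vartheta = \exp(x'\beta)$ and rescaled by $(1 - G_{\alpha,\tilde\vartheta}(\xi)) / (1 - G_{\alpha,\vartheta}(\xi))$. Names and parameter values are hypothetical and serve only as an illustration.

```python
import math

def gamma_cdf(y, shape, scale):
    """Gamma distribution function via the incomplete gamma power series."""
    x = y / scale
    if x <= 0.0:
        return 0.0
    term = 1.0 / shape
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total):
        term *= x / (shape + n)
        total += term
        n += 1
    return total * math.exp(shape * math.log(x) - x - math.lgamma(shape))

def two_tiered_masses(alpha, theta, theta_tilde, xi):
    """Return (P[Y=0], P[0<Y<1], P[Y=1]) for the two-tiered gamma model."""
    p_zero = gamma_cdf(xi, alpha, theta_tilde)
    lower = gamma_cdf(xi, alpha, theta)
    upper = gamma_cdf(1.0 + xi, alpha, theta)
    ratio = (1.0 - p_zero) / (1.0 - lower)      # renormalization after truncation at 0
    p_middle = (upper - lower) * ratio
    p_one = (1.0 - upper) * ratio
    return p_zero, p_middle, p_one

# Hypothetical scales: a different mechanism for zeros than for positive values.
print(two_tiered_masses(alpha=2.0, theta=0.4, theta_tilde=0.2, xi=0.3))

# With equal scales the model collapses to the censored gamma model.
print(two_tiered_masses(alpha=2.0, theta=0.4, theta_tilde=0.4, xi=0.3))
```

In the second call the mass at zero equals $G_{\alpha,\vartheta}(\xi)$ and the mass at one equals $1 - G_{\alpha,\vartheta}(1+\xi)$, as in the censored gamma model.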

We remark that the distributions in both parts of the two-tiered model, i.e.,
the part modeling the probability of $Y$ being zero and the part governing the
behaviour of $0 < Y \le 1$, are assumed to have the same shape parameter $\alpha$ and
the same location parameter $\xi$. Consequently, if $\beta = \gamma$, or $\vartheta = \tilde\vartheta$, the two-tiered
gamma model presented here and the aforementioned censored gamma model
coincide, which means that these two models are nested. This is convenient for
model comparison since it allows the use of a likelihood ratio test to compare the
two models.
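
A likelihood ratio comparison of the nested models can be sketched as follows. The log-likelihood values are hypothetical placeholders, not results from the paper, and the sketch assumes the usual chi-squared reference distribution with as many degrees of freedom as there are restrictions (here, the number of coefficients set equal by $\beta = \gamma$).

```python
import math

def gamma_cdf(y, shape, scale):
    """Gamma distribution function via the incomplete gamma power series."""
    x = y / scale
    if x <= 0.0:
        return 0.0
    term = 1.0 / shape
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total):
        term *= x / (shape + n)
        total += term
        n += 1
    return total * math.exp(shape * math.log(x) - x - math.lgamma(shape))

def chi2_sf(stat, df):
    """Upper tail of a chi-squared distribution, which is Gamma(df/2, scale 2)."""
    return 1.0 - gamma_cdf(stat, df / 2.0, 2.0)

def likelihood_ratio_test(loglik_restricted, loglik_full, df):
    """Return the test statistic and its asymptotic p-value."""
    stat = 2.0 * (loglik_full - loglik_restricted)
    return stat, chi2_sf(stat, df)

# Hypothetical maximized log-likelihoods of the censored gamma model (restricted)
# and the two-tiered gamma model (full), with p = 3 coefficients per predictor.
stat, p_value = likelihood_ratio_test(loglik_restricted=-2510.4,
                                      loglik_full=-2503.1, df=3)
print(stat, p_value)
```

A small p-value would indicate that a separate set of coefficients for the zeros improves the fit beyond what the censored gamma model can explain.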



3.2 Estimation of the Two-tiered Gamma Model 

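Having in mind that the censored gamma model is nested in the two-tiered
gamma model, we restrict ourselves to estimating the coefficients $\beta$ and $\gamma$ of
the two linear predictors using Fisher's scoring algorithm. The shape parameter
$\alpha$ and the location parameter $\xi$ could be estimated via numerical optimization in
an outer loop with starting values obtained from first fitting a censored gamma
model.

With $\theta = (\beta, \gamma)$, the log-likelihood function of the model can be written as
$\ell(\theta) = \sum_{i=1}^n \ell_i(\theta)$ with

$$
\begin{aligned}
\ell_i(\theta) ={}& \log\left(G_{\alpha,\tilde\vartheta_i}(\xi)\right) 1_{\{y_i = 0\}} \\
&+ \left( \log\left(g_{\alpha,\vartheta_i}(y_i + \xi)\right) + \log\left(1 - G_{\alpha,\tilde\vartheta_i}(\xi)\right) - \log\left(1 - G_{\alpha,\vartheta_i}(\xi)\right) \right) 1_{\{0 < y_i < 1\}} \\
&+ \left( \log\left(1 - G_{\alpha,\vartheta_i}(1 + \xi)\right) + \log\left(1 - G_{\alpha,\tilde\vartheta_i}(\xi)\right) - \log\left(1 - G_{\alpha,\vartheta_i}(\xi)\right) \right) 1_{\{y_i = 1\}},
\end{aligned}
$$

where

$$\vartheta_i = \exp(x_i'\beta), \quad \tilde\vartheta_i = \exp(x_i'\gamma).$$

The score functions are

$$
\begin{aligned}
\frac{\partial \ell_i(\theta)}{\partial \beta_k} ={}& x_{ik} \left( \frac{y_i + \xi}{\vartheta_i} - \alpha - \xi\, \frac{g_{\alpha,\vartheta_i}(\xi)}{1 - G_{\alpha,\vartheta_i}(\xi)} \right) 1_{\{0 < y_i < 1\}} \\
&+ x_{ik} \left( (1 + \xi)\, \frac{g_{\alpha,\vartheta_i}(1 + \xi)}{1 - G_{\alpha,\vartheta_i}(1 + \xi)} - \xi\, \frac{g_{\alpha,\vartheta_i}(\xi)}{1 - G_{\alpha,\vartheta_i}(\xi)} \right) 1_{\{y_i = 1\}}
\end{aligned} \tag{25}
$$

and

$$
\frac{\partial \ell_i(\theta)}{\partial \gamma_k} = -x_{ik}\, \xi\, \frac{g_{\alpha,\tilde\vartheta_i}(\xi)}{G_{\alpha,\tilde\vartheta_i}(\xi)}\, 1_{\{y_i = 0\}} + x_{ik}\, \xi\, \frac{g_{\alpha,\tilde\vartheta_i}(\xi)}{1 - G_{\alpha,\tilde\vartheta_i}(\xi)}\, 1_{\{y_i > 0\}}. \tag{26}
$$

The entries of the Fisher Information Matrix $I(\theta)$ are presented in the appendix.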

3.3 The Zero-Inflated Gamma Model 



The extension presented in this section is motivated by the following idea.
Assume that our quantity of interest indeed follows a censored, shifted gamma
distribution. However, additional, artificial zeros occur by some other
mechanism and thus there are more zeros than expected. Deaton and Irish
used such an extension of the Tobit model for modeling expenditures in
household budgets. Recently, a zero-inflated model for censored continuous data has
also been presented by Couturier and Victoria-Feser.

These additional zeros are now allowed to follow their own model, in contrast
to the two-tiered model where all zeros were described together. This view may
make sense in specific applications like insurance, where some of the claims that
result in zero losses may be cases which were filed in order not to miss a formal
deadline or for similar artificial reasons.

In the zero-inflated model, the existence of two latent variables is again
assumed,

$$Y_1^* \sim \mathcal{N}(-\mu, 1) \quad \text{and} \quad Y_2^* \sim G_{\alpha,\vartheta}(y_2^* + \xi),$$

with $\mu = x'\gamma$ and $\vartheta = \exp(x'\beta)$.

The censored gamma model is not nested in the zero-inflated model in the
classical sense. However, the zero-inflated model coincides with the censored
gamma model at the boundary of its parameter space, namely if $\mu \to -\infty$.
For reasons of simplicity, we opt for the normal distribution, i.e., the extra
zeros are modeled using a probit model. Alternatively, one could also use the
logistic distribution.

These two variables are then related to $Y$ through

$$
Y = \begin{cases}
0, & \text{if } Y_1^* < 0, \text{ or if } 0 \le Y_1^* \text{ and } Y_2^* \le 0, \\
Y_2^*, & \text{if } 0 \le Y_1^* \text{ and } 0 < Y_2^* < 1, \\
1, & \text{if } 0 \le Y_1^* \text{ and } 1 \le Y_2^*.
\end{cases}
$$

The variable $Y_1^*$ first decides whether the observed response variable $Y$ is
zero, i.e., if $Y_1^* < 0$ it follows that $Y = 0$. Next, conditional on $Y_1^* \ge 0$, $Y$ is
distributed according to a censored, shifted gamma distribution.

This means that the zeros are governed by two different components of the
model. First, zeros can arise if $Y_1^*$ is smaller than zero. And secondly, they
can occur if, conditional on $Y_1^* \ge 0$, $Y_2^*$ is smaller than zero. Metaphorically
speaking, we add extra mass at zero to the censored gamma distribution, which
can account for potential extra zeros. This approach allows us to distinguish
structural and extra zeros.

Note that the main distinctive feature of this model, in contrast to the
two-tiered model presented in the previous section, is that the distribution of the
second tier of the model is lower censored instead of lower truncated.

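As stated above, we choose to model the extra zeros using a probit model,
i.e.,

$$p_0 := P[Y_1^* < 0] = \Phi(x'\gamma), \quad \gamma \in \mathbb{R}^p. \tag{27}$$

Consequently, the distribution of $Y$ can be characterized by

$$
\begin{aligned}
P[Y = 0] &= p_0 + (1 - p_0) \cdot G_{\alpha,\vartheta}(\xi), \\
P[Y \in (y, y + dy)] &= (1 - p_0) \cdot g_{\alpha,\vartheta}(y + \xi)\, dy, \quad 0 < y < 1, \\
P[Y = 1] &= (1 - p_0) \cdot \left(1 - G_{\alpha,\vartheta}(1 + \xi)\right),
\end{aligned} \tag{28}
$$

where

$$p_0 = \Phi(x'\gamma), \quad \vartheta = \exp(x'\beta), \quad \gamma, \beta \in \mathbb{R}^p, \quad \alpha, \xi > 0.$$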
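
As an illustration of the zero-inflated distribution, the sketch below computes $P[Y = 0]$, $P[0 < Y < 1]$, and $P[Y = 1]$ from a probit probability of an extra zero and a censored gamma component. The function names, the argument `eta` (standing for the linear predictor $x'\gamma$), and all numerical values are assumptions made for this example.

```python
import math

def gamma_cdf(y, shape, scale):
    """Gamma distribution function via the incomplete gamma power series."""
    x = y / scale
    if x <= 0.0:
        return 0.0
    term = 1.0 / shape
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total):
        term *= x / (shape + n)
        total += term
        n += 1
    return total * math.exp(shape * math.log(x) - x - math.lgamma(shape))

def normal_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def zero_inflated_masses(alpha, theta, xi, eta):
    """Return (P[Y=0], P[0<Y<1], P[Y=1]); eta is the probit predictor x'gamma."""
    p_extra = normal_cdf(eta)                   # probability of an extra zero
    lower = gamma_cdf(xi, alpha, theta)
    upper = gamma_cdf(1.0 + xi, alpha, theta)
    p_zero = p_extra + (1.0 - p_extra) * lower  # extra plus structural zeros
    p_middle = (1.0 - p_extra) * (upper - lower)
    p_one = (1.0 - p_extra) * (1.0 - upper)
    return p_zero, p_middle, p_one

# Hypothetical values: roughly 31% extra zeros on top of the censored gamma part.
print(zero_inflated_masses(alpha=2.0, theta=0.4, xi=0.3, eta=-0.5))

# A very negative predictor switches the extra zeros off.
print(zero_inflated_masses(alpha=2.0, theta=0.4, xi=0.3, eta=-40.0))
```

The second call reproduces the censored gamma probabilities, which reflects the fact that the censored gamma model is reached only in the limit of the probit predictor.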
We note that the zero-inflated model reduces to the censored gamma model
in the limit $\mu \to -\infty$, i.e., at the boundary of the parameter space. This
means that a straightforward likelihood ratio test for model selection does not
apply here. In the application in Section 4, we use a simulation based testing
procedure to compare these two models.



3.4 Estimation of the Zero-Inflated Gamma Model 



Since the EM algorithm (Dempster et al. (1977)) lends itself naturally to
fitting mixtures of distributions and because calculations of scores
and the Fisher Information Matrix would be overly complicated, we use the
EM algorithm here.

The EM algorithm presented in the following finds the maximum likelihood
estimators of the parameters $\theta = (\alpha, \beta, \gamma)$. The location parameter $\xi$ is fixed
and assumed to be known. Again, $\xi$ could be obtained from first fitting the
censored gamma model or it could be estimated through numerical optimization
in an outer loop. Alternatively, the values obtained from the EM algorithm
together with the estimated $\xi$ from the censored gamma model can be used as
starting values for generic optimization algorithms such as, for instance,
quasi-Newton methods. We note that in some examples we observed convergence
problems when using quasi-Newton methods without reasonable starting values.

With regard to the EM algorithm, we introduce two latent data variables
$Z$ and $Y^*$. For each $i$, $Z_i$ indicates whether the observation belongs to the
extra zero part of the model ($Z_i = 0$) or to the censored gamma distribution
($Z_i = 1$). The second missing data variable $Y_i^*$ is for the censored gamma part
of the model. It denotes the value of the underlying latent variable which
then is censored at zero and one. The complete data $W$ therefore consists of
$(Z_1, Y_1^*), \dots, (Z_n, Y_n^*)$.

Using this, the complete-data likelihood can be written as

$$L_W(\theta) = \prod_{i=1}^n \Phi(x_i'\gamma)^{1 - Z_i} \cdot \left( \left(1 - \Phi(x_i'\gamma)\right) \cdot g_{\alpha,\vartheta_i}(Y_i^* + \xi) \right)^{Z_i}, \tag{29}$$

where $\log(\vartheta_i) = x_i'\beta$ and $\theta = (\alpha, \beta, \gamma)$, and the complete-data log-likelihood is

$$
\begin{aligned}
\ell_W(\theta) ={}& \sum_{i=1}^n (1 - Z_i) \log\left(\Phi(x_i'\gamma)\right) + Z_i \log\left( \left(1 - \Phi(x_i'\gamma)\right) \cdot g_{\alpha,\vartheta_i}(Y_i^* + \xi) \right) \\
={}& \sum_{i=1}^n (1 - Z_i) \log\left(\Phi(x_i'\gamma)\right) + Z_i \log\left(1 - \Phi(x_i'\gamma)\right) \\
&+ \sum_{i=1}^n Z_i \left( -\alpha \log(\vartheta_i) - \log(\Gamma(\alpha)) + (\alpha - 1) \log(Y_i^* + \xi) - \frac{Y_i^* + \xi}{\vartheta_i} \right).
\end{aligned} \tag{30}
$$

The EM algorithm produces a sequence of estimates $\{\theta^{(t)},\ t = 0, 1, 2, \dots\}$
by alternately applying two steps:

E-step. Compute the expected value of the complete-data log-likelihood, with respect to
the conditional distribution of $W$ given $y$ under the current estimate of the
parameters $\theta^{(t)}$:

$$Q^{(t+1)}(\theta) = E_{\theta^{(t)}}\left[ \ell_W(\theta) \,|\, y \right].$$

M-step. Update the parameter estimate according to:

$$\theta^{(t+1)} = \arg\max_\theta Q^{(t+1)}(\theta).$$

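From (30), we infer that in the E-step three different expectations have to
be calculated: $E_{\theta^{(t)}}[Z_i|y]$, $E_{\theta^{(t)}}[Y_i^* + \xi|y]$, and $E_{\theta^{(t)}}[\log(Y_i^* + \xi)|y]$. For the
sake of notational brevity, we introduce the following abbreviations:

$$G_i^{(t)}(\cdot) = G_{\alpha^{(t)},\vartheta_i^{(t)}}(\cdot)$$

and

$$\tilde G_i^{(t)}(\cdot) = G_{\alpha^{(t)}+1,\vartheta_i^{(t)}}(\cdot),$$

where $\vartheta_i^{(t)} = \exp(x_i'\beta^{(t)})$. The three expectations are then calculated as follows:

$$
E_{\theta^{(t)}}[Z_i|y] = \begin{cases}
\dfrac{\left(1 - \Phi(x_i'\gamma^{(t)})\right) G_i^{(t)}(\xi)}{\Phi(x_i'\gamma^{(t)}) + \left(1 - \Phi(x_i'\gamma^{(t)})\right) G_i^{(t)}(\xi)}, & \text{if } y_i = 0, \\[2ex]
1, & \text{if } y_i > 0,
\end{cases} \tag{31}
$$

$$
E_{\theta^{(t)}}[Y_i^* + \xi|y] = \begin{cases}
\alpha^{(t)} \vartheta_i^{(t)}\, \dfrac{\tilde G_i^{(t)}(\xi)}{G_i^{(t)}(\xi)}, & \text{if } y_i = 0, \\[2ex]
y_i + \xi, & \text{if } 0 < y_i < 1, \\[1ex]
\alpha^{(t)} \vartheta_i^{(t)}\, \dfrac{1 - \tilde G_i^{(t)}(1 + \xi)}{1 - G_i^{(t)}(1 + \xi)}, & \text{if } y_i = 1,
\end{cases} \tag{32}
$$

and

$$
E_{\theta^{(t)}}[\log(Y_i^* + \xi)|y] = \begin{cases}
\log\left(\vartheta_i^{(t)}\right) + \dfrac{H^{(1)}_{\alpha^{(t)}}\!\left(0, \xi / \vartheta_i^{(t)}\right)}{G_i^{(t)}(\xi)}, & \text{if } y_i = 0, \\[2ex]
\log(y_i + \xi), & \text{if } 0 < y_i < 1, \\[1ex]
\log\left(\vartheta_i^{(t)}\right) + \dfrac{H^{(1)}_{\alpha^{(t)}}\!\left((1 + \xi) / \vartheta_i^{(t)}, \infty\right)}{1 - G_i^{(t)}(1 + \xi)}, & \text{if } y_i = 1.
\end{cases} \tag{33}
$$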



Concerning the M-step, we note that the log-likelihood in (30) splits into
two terms which can be maximized separately. The first term contains the
parameters of the extra zero model part ($\gamma$) and the other contains the parameters
of the censored gamma distribution ($\alpha$ and $\beta$).
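
One ingredient of the E-step, the conditional probability that an observation belongs to the censored gamma part, follows from Bayes' rule applied to (28): for $y_i = 0$ it is $(1 - p_0) G_{\alpha,\vartheta_i}(\xi) / (p_0 + (1 - p_0) G_{\alpha,\vartheta_i}(\xi))$, and for $y_i > 0$ it equals one. The Python sketch below computes this quantity; the function names, the argument `eta` (the current probit predictor $x_i'\gamma$), and the numbers are illustrative assumptions.

```python
import math

def gamma_cdf(y, shape, scale):
    """Gamma distribution function via the incomplete gamma power series."""
    x = y / scale
    if x <= 0.0:
        return 0.0
    term = 1.0 / shape
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total):
        term *= x / (shape + n)
        total += term
        n += 1
    return total * math.exp(shape * math.log(x) - x - math.lgamma(shape))

def normal_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_z(y, alpha, theta, xi, eta):
    """E[Z_i | y_i]: probability that observation i stems from the gamma part."""
    if y > 0.0:
        return 1.0                      # positive losses cannot be extra zeros
    p_extra = normal_cdf(eta)
    structural = (1.0 - p_extra) * gamma_cdf(xi, alpha, theta)
    return structural / (p_extra + structural)

# Hypothetical current parameter values for three observations.
for y in (0.0, 0.35, 1.0):
    print(y, expected_z(y, alpha=2.0, theta=0.4, xi=0.3, eta=-0.5))
```

These weights enter both parts of the M-step: they act as fractional responses in the probit part and as observation weights in the censored gamma part.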



4 An Application 

4.1 Loss Given Default Data 

We apply the models presented above to a dataset from insurance. A surety 
bond is a contractual agreement among three parties: the contractor who
performs an obligation, the obligee who receives the obligation, and the surety
provider, in our case the insurance company, who covers the risk that the
contractor fails to fulfill the obligation.

The dataset consists of European surety bonds that resulted in a claim. 
The ultimate loss for these claims is called "Loss Given Default" (LGD). For 
each bond, the maximal amount that is covered by the insurance company, a 
quantity called "face value" (FV), is a priori determined. This allows us to 
standardize the LGD by dividing it by the face value, such that our variable of
interest lies between 0 and 1:

$$0 \le \frac{LGD}{FV} \le 1. \tag{34}$$

We have worked with the original dataset, but for confidentiality reasons
the results presented here are obtained on the basis of a subsample of the
original set. The subsample, consisting of more than 5000 bonds, is obtained by
using a random selection mechanism, with selection probabilities that depend on
certain characteristics of the respective bonds, so that the value of the average
standardized loss LGD/FV is altered in order not to reveal the true average.
As a consequence, the results presented in this paper are not the real ones but
are close enough to reflect the major phenomena. We confirm that the fit the
models provide to the original data is at least as good as for the subsample.

The standardized losses are shown in Figure 1. Since the insurance company
can often recover costs, observations with no ultimate loss at all are frequent.
In fact, about 52% of all bonds in the subsample have no loss. On the other
hand, a substantial proportion (15%) of bonds have full loss, i.e., an
LGD/FV equal to 1.

Apart from providing a probabilistic model for the surety LGD, the purpose
is also to explore the relation of the losses to certain covariates, which are
briefly described in the following.

The relative default time (RDT) of a bond is the time that has passed between
its issuance and default, as a proportion of the total life span of the bond.
This quantity allows us to explore the time development of the losses from the
issuing date to the end date (maturity). Experience and size are two categorical
variables, each attaining three different levels, which represent the experience
(low, mid, high) and the size (small, medium, large) of the contractor. There
are three different types of surety bonds called maintenance, performance, and
hybrid bonds. Hybrid bonds are bonds that are both maintenance and
performance bonds. There is an additional category denoted "other bonds" for a
small number of bonds of various other categories. Usually, European surety
bonds do not cover the whole amount of an underlying contract but only a
certain fraction. Information about this percentage is included as an additional
covariate. In Table 2 in the online supplementary material, we report
summary statistics for the continuous covariates and relative frequencies for the
categorical variables.

4.2 Results 



We first estimate the censored gamma model of Section 2 with no covariates
and illustrate its fit in Figure 1. The dashed red line represents the fitted
model. The numbers in parentheses above the bars show the fitted probabilities
of being zero and one. Apparently, the plain model with no covariates fits
the data well. The observed and the modeled probabilities of being zero or
one are very similar, and the continuous part of the model accurately fits the
histogram.² For comparison, we have also fitted the standard normal Tobit
model in its two-sided version, as well as a corresponding model using a skewed
t distribution (Azzalini and Capitanio (2003)). See the supplementary material
for more details. Both models provide worse fits than the censored gamma
model. A plot (Figure 4) illustrating the fits can be found in the supplementary
material.

[Figure 1: histogram of LGD/FV (horizontal axis: LGD/FV). Annotations:
52.44% (52.36%) at zero and 15.38% (15.73%) at one.]

Figure 1: Histogram of LGD/FV and fitted censored gamma model with no
covariates. The numbers above the blue arrows represent the percentage of
LGD/FV's being exactly zero or one, respectively. In parentheses are the
corresponding numbers as predicted by the censored gamma model. The dashed
red line represents the fitted model.

Next, we fit a model using only the face value, more specifically the
logarithm and the squared logarithm of the face value, as covariates. We
illustrate the fitted model in Figure 2. The colored continuous lines are
non-parametrically fitted quantile curves (see Koenker (2005)) and mean curves
(calculated using local polynomial regression, see Chambers and Hastie (1992),
Chapter 8). The dashed lines represent the corresponding quantiles and mean of
the fitted model calculated using the result in Lemma 2.1. We also fit the
conditional mean model for fractional response (FR) of Papke and Wooldridge
(1996). Here, fitting is done using quasi-maximum likelihood (see Gourieroux
et al. (1984) for details) based on the Bernoulli log-likelihood function.

² Due to the large number of observations, a chi-square goodness of fit test
still shows significant deviations.

[Figure 2: scatter plot "LGD/FV vs. Face Value".]

Figure 2: Scatter plot of face value (on a logarithmic scale) vs. LGD/FV.
The jittered points in the bars below 0.0 and above 1.0 represent bonds with
LGD/FV being exactly zero and one, respectively. The colored solid lines are
non-parametrically fitted quantiles and mean. The dashed lines represent
quantiles and mean of the fitted censored gamma (CG) model. The green dotted line
represents the fitted conditional mean of the fractional response (FR) model.
Logarithmic and squared logarithmic face value are taken as covariates.
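The quasi-maximum likelihood fit of the fractional response model mentioned above can be sketched as follows. This is a minimal illustration and not the code used for the paper: it assumes a logit link for the conditional mean (the link is not stated in this section), uses a single covariate plus intercept, and maximizes the Bernoulli quasi-log-likelihood by Newton's method. The function name and the data are hypothetical.

```python
import math


def fit_fractional_logit(x, y, n_iter=50):
    """Maximize sum_i [y_i log p_i + (1 - y_i) log(1 - p_i)] with
    p_i = 1 / (1 + exp(-(b0 + b1 * x_i))) by Newton's method."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        # Score vector of the Bernoulli quasi-log-likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Negative Hessian (2 x 2, symmetric).
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system explicitly.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical fractional responses in [0, 1], including exact zeros and ones.
x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
y = [0.0, 0.2, 0.1, 0.6, 1.0, 0.7]
b0, b1 = fit_fractional_logit(x, y)
fitted = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
```

Because the model contains an intercept, the score equations imply that the fitted means average to the sample mean of the responses, which is a convenient sanity check for such a fit.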

The non-parametrically fitted mean and the mean of the fitted censored
gamma model are very close together. This indicates that the censored gamma
model provides a good fit to the conditional mean. Moreover, the
non-parametrically estimated quantiles and the quantiles from the fitted
censored gamma model match well. That is, the censored gamma model not only
models the mean appropriately but also the entire distribution. In addition,
the fitted mean of the fractional response model is very close to the mean of
the fitted censored gamma model. Again, we have also fitted the Tobit model and
the skewed t version. Compared to the censored gamma model, both models provide
worse fits (see Figure 5 in the supplementary material).

Finally, we fit the censored gamma model and its two extensions, i.e., the
two-tiered and the zero-inflated models, including all covariates. For the two
ordinal factorial variables experience and size, we use orthogonal polynomial
contrasts. Concerning the categorical variable type, we use treatment contrasts
with maintenance as baseline level. For the censored gamma model, we use the
Fisher scoring algorithm presented above. In the case of the two-tiered and
zero-inflated models, we use the algorithms presented in this paper to determine
good starting values for quasi-Newton methods. Starting values for the
parameters that are not estimated with these algorithms, i.e., the shape
parameter $\alpha$ and the location parameter $\xi$, respectively, are taken
from the fitted censored gamma model. We then estimate the two
models using quasi-Newton methods. Concerning the censored gamma model,
estimates of standard errors are calculated using the Fisher information. For the
other two models, standard errors are obtained by numerically approximating
the Fisher information matrix at the optimum.

The results are reported in Table 1. The log-likelihoods of both the two-tiered
and the zero-inflated model are considerably higher than that of the censored
gamma model. This is also reflected in considerably smaller AIC values, the
zero-inflated model having the lowest AIC. A likelihood ratio test clearly favors
the two-tiered model over the censored gamma model. This is also true for
the zero-inflated model. For the latter, the null hypothesis is on the boundary
of the parameter space, and the usual asymptotics do not apply. We therefore
use a simulated test instead. To be more specific, the distribution of the
difference in log-likelihoods between the two models under the null hypothesis
is characterized by 1000 simulated values. A sample from this distribution is
generated by simulating data from the null hypothesis, i.e., from the estimated
censored gamma model, then fitting both models, and calculating the difference
in the two log-likelihoods. The lowest simulated difference obtained out
of the 1000 samples was about 28.6. We conclude that the observed difference
of more than 200 is clearly significant. Next, for discriminating between the
two extended models, we apply Vuong's test (Vuong (1989)). Since we know
that the zero-inflated model does not reduce to the censored gamma model, it
follows that we are not in the overlapping case. Thus, we can use Vuong's
non-nested hypothesis test. The test statistic has a value of -2.26. Under the
null hypothesis that both models are equally close to the true model, the
statistic is asymptotically standard normal, so at the 5% level the null
hypothesis is rejected in favor of the zero-inflated model.
This gives support to the idea that there are indeed extra zeros in the data.
These extra zeros are interpreted as zero losses from claims that were filed for
administrative reasons and not because there was a true default event. As
before, we have also fitted the Tobit model and the skewed t distribution model
using all covariates. The results are reported in Tables 3 and 4 in the
supplementary material. In all cases, the gamma models have considerably lower
AICs, and the corresponding differences in log-likelihood are always larger than
100, except when comparing the two-tiered gamma model with the two-tiered skewed
t model, where the difference is about 8 in favor of the gamma model. This
means that Vuong's test favors the gamma model in all cases.
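The model comparison figures quoted above can be retraced from the log-likelihoods in Table 1. The sketch below is illustrative only. The parameter counts are an assumption inferred from Table 1 (13 regression coefficients per linear predictor plus the two gamma parameters), and the two-sided normal p-value for Vuong's statistic relies on the asymptotic standard normal distribution under the null hypothesis.

```python
import math


def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * loglik


# (log-likelihood from Table 1, ASSUMED number of parameters)
models = {
    "censored": (-7898.4, 15),       # 13 coefficients + 2 gamma parameters
    "two-tiered": (-7684.9, 28),     # 2 x 13 coefficients + 2 gamma parameters
    "zero-inflated": (-7680.5, 28),  # 2 x 13 coefficients + 2 gamma parameters
}
aics = {name: aic(ll, k) for name, (ll, k) in models.items()}

# Likelihood ratio statistic, two-tiered versus censored gamma model.
lr_two_tiered = 2 * (models["two-tiered"][0] - models["censored"][0])

# Observed difference in log-likelihoods, zero-inflated versus censored
# gamma model; this is the quantity compared with the simulated values.
diff_zero_inflated = models["zero-inflated"][0] - models["censored"][0]


def two_sided_normal_p(z):
    """Two-sided p-value 2 * (1 - Phi(|z|)) for a standard normal statistic."""
    return 1.0 - math.erf(abs(z) / math.sqrt(2.0))


p_vuong = two_sided_normal_p(-2.26)
```

Up to rounding of the reported log-likelihoods, the computed AIC values agree with those in Table 1, and the p-value for Vuong's statistic lies below 0.05.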



4.3 Interpretation of Results 

Having concluded that the zero-inflated model provides the best fit
to our data, we now interpret the obtained results. Interpretation is not as
straightforward as, for instance, in the basic censored gamma model (see Section
2.2). In the zero-inflated extension there are two linear
predictors, $\eta = x'\beta$ and $\mu = x'\gamma$. Partial effects on, say, the
conditional mean therefore include both sets of coefficients $\beta$ and
$\gamma$. We will focus on $E[Y|x]$
and $P[Y = 0|x]$ in the following. These two quantities and their corresponding
partial effects are calculated in the following lemma.

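Lemma 4.1 For the zero-inflated model, the following relations hold true.

$$E[Y|x] = (1 - \Phi(\mu))\, C^{1}_{\alpha,\vartheta}, \tag{35}$$

where

$$C^{1}_{\alpha,\vartheta} = \alpha\vartheta \left(G_{\alpha+1,\vartheta}(1+\xi) - G_{\alpha+1,\vartheta}(\xi)\right) + (1+\xi)\left(1 - G_{\alpha,\vartheta}(1+\xi)\right) - \xi\left(1 - G_{\alpha,\vartheta}(\xi)\right), \tag{36}$$

and

$$P[Y = 0|x] = \Phi(\mu) + (1 - \Phi(\mu))\, G_{\alpha,\vartheta}(\xi). \tag{37}$$

For a continuous covariate $x_j$, we have

$$\frac{\partial E[Y|x]}{\partial x_j} = \beta_j C^{2}_{\alpha,\vartheta} (1 - \Phi(\mu)) - \gamma_j \phi(\mu) C^{1}_{\alpha,\vartheta}, \tag{38}$$

where

$$C^{2}_{\alpha,\vartheta} = \alpha\vartheta \left(G_{\alpha+1,\vartheta}(1+\xi) - G_{\alpha+1,\vartheta}(\xi)\right), \tag{39}$$

and

$$\frac{\partial P[Y = 0|x]}{\partial x_j} = -\beta_j\, \xi\, g_{\alpha,\vartheta}(\xi)(1 - \Phi(\mu)) + \gamma_j \phi(\mu) \left(1 - G_{\alpha,\vartheta}(\xi)\right). \tag{40}$$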



The lemma follows from (28) together with Lemma 2.1. We see that the
partial effects contain $\beta$ and $\gamma$, both entering in a non-linear
manner and interacting with each other. This follows from the fact that
$\vartheta = \exp(x'\beta)$ and $\mu = x'\gamma$. We therefore concluded that
interpretation is best done graphically, as described in the following.
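To make the formulas of Lemma 4.1 concrete, the following sketch evaluates the conditional mean, the probability of a zero, and their partial effects. It is a minimal illustration under stated assumptions: $\Phi$ and $\phi$ are taken to be the standard normal distribution function and density, $G_{\alpha,\vartheta}$ and $g_{\alpha,\vartheta}$ the gamma distribution function and density with shape $\alpha$ and scale $\vartheta$, and all numerical parameter values are hypothetical. The function names are not from the paper.

```python
import math


def gamma_pdf(y, shape, scale):
    """Gamma density with the given shape and scale, for y > 0."""
    return math.exp((shape - 1.0) * math.log(y) - y / scale
                    - math.lgamma(shape) - shape * math.log(scale))


def gamma_cdf(y, shape, scale):
    """Gamma distribution function via the power series of the
    regularized lower incomplete gamma function."""
    if y <= 0.0:
        return 0.0
    x = y / scale
    term = 1.0 / shape
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total):
        term *= x / (shape + n)
        total += term
        n += 1
    return total * math.exp(shape * math.log(x) - x - math.lgamma(shape))


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def c2(alpha, theta, xi):  # equation (39)
    return alpha * theta * (gamma_cdf(1.0 + xi, alpha + 1.0, theta)
                            - gamma_cdf(xi, alpha + 1.0, theta))


def c1(alpha, theta, xi):  # equation (36)
    return (c2(alpha, theta, xi)
            + (1.0 + xi) * (1.0 - gamma_cdf(1.0 + xi, alpha, theta))
            - xi * (1.0 - gamma_cdf(xi, alpha, theta)))


def mean_y(eta, mu, alpha, xi):  # equation (35), with theta = exp(eta)
    return (1.0 - norm_cdf(mu)) * c1(alpha, math.exp(eta), xi)


def prob_zero(eta, mu, alpha, xi):  # equation (37)
    return norm_cdf(mu) + (1.0 - norm_cdf(mu)) * gamma_cdf(xi, alpha, math.exp(eta))


def d_mean(eta, mu, alpha, xi, beta_j, gamma_j):  # equation (38)
    theta = math.exp(eta)
    return (beta_j * c2(alpha, theta, xi) * (1.0 - norm_cdf(mu))
            - gamma_j * norm_pdf(mu) * c1(alpha, theta, xi))


def d_prob_zero(eta, mu, alpha, xi, beta_j, gamma_j):  # equation (40)
    theta = math.exp(eta)
    return (-beta_j * xi * gamma_pdf(xi, alpha, theta) * (1.0 - norm_cdf(mu))
            + gamma_j * norm_pdf(mu) * (1.0 - gamma_cdf(xi, alpha, theta)))


# Hypothetical values of the linear predictors, parameters, and coefficients.
eta, mu, alpha, xi = -1.0, 0.1, 0.6, 0.02
beta_j, gamma_j = 0.3, 0.35
effect_on_mean = d_mean(eta, mu, alpha, xi, beta_j, gamma_j)
effect_on_p0 = d_prob_zero(eta, mu, alpha, xi, beta_j, gamma_j)
```

A simple way to check such an implementation is to compare the analytic partial effects with central finite differences of `mean_y` and `prob_zero` along the direction in which the covariate moves both linear predictors.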

In Figure 3, contour plots of the conditional expectation, $E[Y|x]$, and the
probability of being zero, $P[Y = 0|x]$, for the fitted zero-inflated model are
shown. Contour levels are obtained with respect to varying values of the two
linear predictors $\eta = x'\beta$ and $\mu = x'\gamma$. The arrows represent
the effects of the covariates. The midpoints of the arrows are the levels of
$E[Y|x]$ and $P[Y = 0|x]$, respectively, attained when taking all continuous
covariates at their mean and the categorical variables at their most frequent
level. We focus on the three
variables face value (FV), relative default time (RDT), and experience (Exp)
since these are believed to be the most important variables from a practical
point of view. Interpretation for the other covariates is analogous. For the
two continuous covariates face value (FV) and relative default time (RDT), the
blue and red arrows in Figure 3 are obtained by increasing the variables by one
standard deviation from their mean. For the categorical variable experience
(Exp), the green arrows illustrate the changes in $E[Y|x]$ and $P[Y = 0|x]$ when
moving from the lowest level to the middle one and then to the highest level of
experience.

Concerning the conditional expectation, the blue arrow of the FV shows that
an increase of FV by one standard deviation leads to an increase in $E[Y|x]$ by
about 0.05. RDT, on the other hand, has virtually no effect on the mean.
Even though both linear predictors change considerably when increasing RDT,
the change is along a contour level and has no effect on the value of $E[Y|x]$.
Concerning the experience, we observe strong effects when going from low
experience to middle and high, with a total decrease of about 0.17.

Figure 3: Illustration of effects of main covariates for the zero-inflated model.
On the left hand side, a contour plot of the conditional expectation, $E[Y|x]$,
as a function of the two linear predictors $\eta = x'\beta$ and
$\mu = x'\gamma$ is shown. On
the right hand side, the same contour plot is shown for the probability of being
zero, $P[Y = 0|x]$. The arrows represent the effects of changing covariates. For
the two continuous covariates face value (FV) and relative default time (RDT),
the arrows are obtained by increasing the variables by one standard deviation
from their mean. For the factorial variable experience (Exp), the two arrows
indicate the changes when moving from the lowest level to the middle one and
then to the highest level.

For the probability of being zero, the picture is slightly different. FV has
only a small effect on $P[Y = 0|x]$, whereas increasing RDT by one standard
deviation results in an increase of about 6% in $P[Y = 0|x]$. Experience again
has a strong effect. $P[Y = 0|x]$ increases by more than 20% when going from
low to high experience.

5 Conclusion 

Three special regression models for fractional response variables that attain
their boundaries frequently were presented. The first model determines the
distribution of the values between the limits and the frequency of the limiting
values in a parsimonious way. Two extensions of this model to cover cases in
which the frequencies of the limits do not follow this parsimonious description
were introduced as well. The models were applied to an LGD dataset from
insurance. They were found to fit the data in a specific insurance application
better than other popular parametric models.
Entries are Coef (Std. Err.), followed by the significance code.

Covariate           Censored             Two-Tiered (β)       Two-Tiered (γ)       Zero-Inflated (β)    Zero-Inflated (γ)
Intercept           3.9 (0.34) ***       3.9 (0.33) ***       -3.2 (0.61) ***      4.1 (0.35) ***       0.023 (0.18)
RDT, Lin            -0.17 (0.10) .       0.30 (0.10) **       -0.45 (0.079) ***    0.29 (0.10) **       0.35 (0.057) ***
RDT, Quad           0.074 (0.35)         1.6 (0.35) ***       -1.1 (0.26) ***      1.6 (0.35) ***       0.88 (0.20) ***
Experience, Lin     -0.82 (0.076) ***    -0.39 (0.064) ***    -0.67 (0.066) ***    -0.38 (0.065) ***    0.42 (0.037) ***
Experience, Quad    0.12 (0.051) *       0.064 (0.045)        0.068 (0.041) .      0.059 (0.046)        -0.017 (0.026)
Size, Lin           0.56 (0.32) .        0.35 (0.37)          0.44 (0.24) .        0.34 (0.38)          -0.36 (0.19) .
Size, Quad          0.66 (0.20) **       -0.17 (0.24)         0.85 (0.15) ***      -0.18 (0.24)         -0.68 (0.12)
Face Value, Lin     -0.80 (0.071) ***    -0.96 (0.065) ***    -0.0048 (0.050)      -0.99 (0.070) ***    -0.054 (0.047)
Face Value, Quad    0.50 (0.068) ***     0.15 (0.053) **      0.49 (0.064) ***     0.18 (0.054) **      -0.33 (0.043) ***
Type, Hybrid        2.9 (1.5) .          2.0 (1.2) .          2.7 (1.2) *          1.7 (1.1)            -1.9 (0.80) *
Type, Performance   0.015 (0.12)         0.16 (0.11)          -0.12 (0.099)        0.17 (0.11)          0.12 (0.070) .
Type, Other         0.23 (0.16)          0.52 (0.17) **       -0.20 (0.12)         0.57 (0.17) **       0.19 (0.095) *
Ins. Frac.          1.2 (0.56) *         1.5 (0.49) **        -0.43 (0.39)         1.6 (0.49) ***       0.40 (0.28)

Gamma Par.          Censored             Two-Tiered                                Zero-Inflated
                    Value (Std. Err.)    Value (Std. Err.)                         Value (Std. Err.)
log(α)              -1.5 (0.050)         -0.54 (0.067)                             -0.57 (0.073)
log(ξ)              -2.4 (0.093)         -4.5 (0.47)                               -4.3 (0.44)

Log-Likelihood      -7898.4              -7684.9                                   -7680.5
AIC                 15826.8              15425.9                                   15417.1

Table 1: Fitted censored, two-tiered, and zero-inflated gamma models including all covariates. Codes for significance levels: '***':
p < 0.001, '**': 0.001 < p < 0.01, '*': 0.01 < p < 0.05, '.': 0.05 < p < 0.1.
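A Proof of Lemma 2.1

Firstly, a censored gamma distribution with the density of the censored gamma
model has expectation

$$
\begin{aligned}
E[Y|x] &= 0 \cdot G_{\alpha,\vartheta}(\xi) + \int_0^1 y\, g_{\alpha,\vartheta}(y + \xi)\, dy + 1 \cdot \left(1 - G_{\alpha,\vartheta}(1 + \xi)\right) \\
&= \int_{\xi}^{1+\xi} (z - \xi)\, g_{\alpha,\vartheta}(z)\, dz + \left(1 - G_{\alpha,\vartheta}(1 + \xi)\right) \\
&= \alpha\vartheta \left(G_{\alpha+1,\vartheta}(1 + \xi) - G_{\alpha+1,\vartheta}(\xi)\right) - \xi \left(G_{\alpha,\vartheta}(1 + \xi) - G_{\alpha,\vartheta}(\xi)\right) + \left(1 - G_{\alpha,\vartheta}(1 + \xi)\right) \\
&= \alpha\vartheta \left(G_{\alpha+1,\vartheta}(1 + \xi) - G_{\alpha+1,\vartheta}(\xi)\right) + (1 + \xi)\left(1 - G_{\alpha,\vartheta}(1 + \xi)\right) - \xi \left(1 - G_{\alpha,\vartheta}(\xi)\right),
\end{aligned}
\tag{41}
$$

where in the third line we have used the identity (47) given in the supplementary
material.

Secondly, for a continuous $x_j$, using

$$\frac{\partial G_{\alpha,\vartheta}(\xi)}{\partial \vartheta} = -\frac{\xi}{\vartheta}\, g_{\alpha,\vartheta}(\xi), \tag{42}$$

or

$$\frac{\partial G_{\alpha,\vartheta}(\xi)}{\partial \vartheta} = -\alpha\, g_{\alpha+1,\vartheta}(\xi), \tag{43}$$

(and analogously with $1 + \xi$ in place of $\xi$), and the fact that

$$\frac{\partial \vartheta}{\partial x_j} = \vartheta \beta_j,$$

we can compute the partial derivatives of $E[Y|x]$ with respect to $x_j$ as

$$
\begin{aligned}
\frac{\partial E[Y|x]}{\partial x_j} &= -\xi \alpha\, g_{\alpha+1,\vartheta}(\xi)\, \vartheta \beta_j + \alpha\vartheta \left(G_{\alpha+1,\vartheta}(1 + \xi) - G_{\alpha+1,\vartheta}(\xi)\right) \beta_j \\
&\quad - \alpha\vartheta\, \frac{1 + \xi}{\vartheta}\, g_{\alpha+1,\vartheta}(1 + \xi)\, \vartheta \beta_j + \alpha\vartheta\, \frac{\xi}{\vartheta}\, g_{\alpha+1,\vartheta}(\xi)\, \vartheta \beta_j \\
&\quad + (1 + \xi)\, \alpha\, g_{\alpha+1,\vartheta}(1 + \xi)\, \vartheta \beta_j \\
&= \alpha\vartheta \left(G_{\alpha+1,\vartheta}(1 + \xi) - G_{\alpha+1,\vartheta}(\xi)\right) \beta_j.
\end{aligned}
\tag{44}
$$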
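The closed form (41) for the mean of the censored gamma distribution can be checked numerically. The sketch below is illustrative and rests on stated assumptions: $g_{\alpha,\vartheta}$ and $G_{\alpha,\vartheta}$ are taken to be the gamma density and distribution function with shape $\alpha$ and scale $\vartheta$, the integral over the uncensored part is approximated by a midpoint rule, and the parameter values are hypothetical.

```python
import math


def gamma_pdf(y, shape, scale):
    """Gamma density with the given shape and scale, for y > 0."""
    return math.exp((shape - 1.0) * math.log(y) - y / scale
                    - math.lgamma(shape) - shape * math.log(scale))


def gamma_cdf(y, shape, scale):
    """Gamma distribution function via the power series of the
    regularized lower incomplete gamma function."""
    if y <= 0.0:
        return 0.0
    x = y / scale
    term = 1.0 / shape
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total):
        term *= x / (shape + n)
        total += term
        n += 1
    return total * math.exp(shape * math.log(x) - x - math.lgamma(shape))


def mean_closed_form(alpha, theta, xi):
    """Last line of (41)."""
    return (alpha * theta * (gamma_cdf(1.0 + xi, alpha + 1.0, theta)
                             - gamma_cdf(xi, alpha + 1.0, theta))
            + (1.0 + xi) * (1.0 - gamma_cdf(1.0 + xi, alpha, theta))
            - xi * (1.0 - gamma_cdf(xi, alpha, theta)))


def mean_numeric(alpha, theta, xi, n=20000):
    """First line of (41): midpoint rule for the integral over (0, 1)
    plus the point mass at 1."""
    h = 1.0 / n
    integral = h * sum((i + 0.5) * h * gamma_pdf((i + 0.5) * h + xi, alpha, theta)
                       for i in range(n))
    return integral + (1.0 - gamma_cdf(1.0 + xi, alpha, theta))


# Hypothetical parameter values.
alpha, theta, xi = 2.0, 0.3, 0.1
closed = mean_closed_form(alpha, theta, xi)
numeric = mean_numeric(alpha, theta, xi)
```

The two values agree up to the discretization error of the midpoint rule, which is far below the precision reported in the tables of this paper.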
Supplementary Material

S.1 Fisher Information Matrix for the Censored Gamma Model

In the following derivations, we will often use some identities and results on
integrals that we list in Section S.3 below.
With (USD, it follows that 



da' da' 



= E e 
+ E e 
+ E e 



a 



a (- log(tfi) - ip(a) + Iog(j/< + 0) 1{0 <2/I <i}) 



"{Wi=l} 



+ / (a (- log(^) - V(a) + log(i/j + 0)) (l/< + 0% 



+ 



a 



-V(a)G aA (1 + + ffP (o, ^) ) ) • (1 - G Mi (1 + 0). 



i-G aA (i+o 

Using (f5Tj) and ([52]) . the middle summand of this expression is calculated as 



(a (- log(tfi) - ip(a) + log(y, + 0)) 2 + 0% 

=a 2 (log(0 i ) + ^(a)) 2 (G QA (l + C) - 



- 2a 2 (log(tf i ) + ^(a)) (log(^)(G aA (l + £) - G aA (£)) + 
+ a 2 log(^) 2 (G Mi (l + - + 2a 2 log^tfW 



+ a 2 ^) 



=aV(a) 2 (G QA (l + e)-G clA (e))-2a 2 ^(a)^ 1) ( js ^) + (X ^) 






From this follows that 
Eg 



dij d£j 
da' da' 

2 



a 



,2 



-iP(a)G aA (0 + HW (q ,i 



+ 



l-G M .(l + 



-V(a)G QA (l + 0+^i 1) (0, 



For the remaining entries of the Fisher Information Matrix, the calculation
procedure is similar to the one above. That is, the computation of each
expectation can be split into three terms, of which the middle term,
corresponding to the non-censored part of the model, requires more effort to
compute. In the following, we therefore first calculate the corresponding middle
term in each case.

With (33), USD, (USD, and ggi, we calculate 



Eg 



a (- log(#j) - tp{a) + \og{yi + £)) %ik -a + 



0i 



L {0< W <1} 



--a 2 x ik log(^)(G aA (l + £) - + 
- Q 2 x ifc log(i? i )(G Q+ i i ^(l + C) - G Q+ i A (£)) - a 2 ^fc^(a)(G Q+ i^(l + C) - 



- Q 2 x ifc log(^)(G aA (l + - GaftiZ)) - a 2 x lk H^ 
+ a 2 x ik log(#j)(G a+ i + -G Q+ i A (^)) + a 2 x ifcj ff a+ i 



i i + 1 



=a 2 x ifc V(a)(G Q+ i A (0 -G a+Mi (l + + (1 + f)) 

=a 2 x ifc (ip(a)tfig a+1} #. (! + £)- V>(aO*!% a+ i i1?i (£)) 






Using this result, (fT6|) . and (fTT|) . we get 

0& % 



E, 



--E, 



da' d[3 k 
a 



+ E e 



)^(« + ^(oi))^f^ 1( „. 
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+ E B 



a (- log(tfj) - V(a) + log(yi + 0) ( ~ a + Z1 ^~ 2 - j 1 {o<y i <i} 
' -a (-^(a)G M< (l + Q + iff > (o, ffi)) ^ + g , ^ (1 + g) 

i-G aA (i+e) i-c aA (i+e) 

< • (£) (-^(a)G Q ^(0 + (o, |' 



+ x ik a 2 {ij)(a)'&ig ol+1 A (£ + 1) - i/>(a)#i&*+lA (0) 



IJ « .a ' 



o(l + • J aA (1 + (-^(a)G Mj (1 + + (o, ^ 



l-G M .(l + 
Next, with (@7J), (|35|). and (gSP, we calculate 
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Zifctfii ( -« + — — I l{o< K <i} 

= x ik xua 2 (G a ^ r (l + 0~ Ga^O) - 2a 2 x ifc Xii(G Q+ i A (l + - G a+ i A (£)) 
+ o(a + l)x ik xu(G a+ 2^(l + ~ G a+2 ,^(0) 

= a 2 X ifc Xi/1?i (5a+l,i?i (1 + - 9a+l,#i (0 - (1 + + (£)) 



Using this result and (jrfj) . we see that 
Moreover, with (|39j). (|51j) . (j5lj) . and (g5D, we get

Z:V) a (- log(tfi) - ^(a) + log(y l + £)) (j^J ~ 1 {0< J/l <i}
With this equation, (fl~6j) . and (fTHj) . we calculate
With (g7|, (@9D, and we calculate
Using the above result, we have
Next, with (@9j), ^50]), (|15]>. we calculate
Finally, using this result, we have

S.2 Fisher Information Matrix for the Two-tiered Gamma Model

First, with flUD and (jig)), we get
Next, with (gTJ) and the identity in (gBJ), we get
Finally, we calculate
' dL at 



^. ( i_ G (0) J { {0<K<1} + te=1}) 



-XifcXu 



^ i>a (0(i-Ga i>a (0) 



S.3 Useful Identities and Integrals 

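By partial integration, we calculate

$$
\begin{aligned}
G_{\alpha+1,\vartheta}(\xi) &= \frac{1}{\vartheta^{\alpha+1}\Gamma(\alpha + 1)} \int_0^{\xi} y^{\alpha} \exp(-y/\vartheta)\, dy \\
&= \frac{1}{\vartheta^{\alpha+1}\Gamma(\alpha + 1)} \left(-\vartheta\, \xi^{\alpha} \exp(-\xi/\vartheta)\right) + \frac{\alpha\vartheta}{\vartheta^{\alpha+1}\Gamma(\alpha + 1)} \int_0^{\xi} y^{\alpha-1} \exp(-y/\vartheta)\, dy \\
&= -\frac{1}{\Gamma(\alpha + 1)} \left(\frac{\xi}{\vartheta}\right)^{\alpha} \exp(-\xi/\vartheta) + G_{\alpha,\vartheta}(\xi).
\end{aligned}
$$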

And from this follows 



or 



For < Z < u, the following equations hold true. 
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r 1 1 

J ^9aAy)dy = ( a _ i)( a _ 2)^2 ( G "-2,4^) - G a -2,o(0)- ( 50 ) 



j\og{y)g a Ay)dy = log(tf)(G M (u) - G a ,*{l)) + . (51) 

J U log(y) 2 g a ^(y)dy = log(0) 2 (G a ,,,(u) - G a j(l)) 

+ 21ogW *u>(<,») + „ f (',»). (52) 

ylog(y)g a ^(y)dy =atf log(tf)(G a+M (u) - 



^ lo^/)^ (y) ^ = 1 log(0)(G Q _ M (u) - G 0-1^(1)) 



(a -1)0 a_i \i? '0 
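The recursion obtained by partial integration at the beginning of this section, $G_{\alpha+1,\vartheta}(\xi) = G_{\alpha,\vartheta}(\xi) - \Gamma(\alpha+1)^{-1} (\xi/\vartheta)^{\alpha} \exp(-\xi/\vartheta)$, can be verified numerically. The sketch below is illustrative only: it assumes that $G_{\alpha,\vartheta}$ is the gamma distribution function with shape $\alpha$ and scale $\vartheta$, approximates both distribution functions by a midpoint rule, and uses hypothetical parameter values.

```python
import math


def gamma_pdf(y, shape, scale):
    """Gamma density with the given shape and scale, for y > 0."""
    return math.exp((shape - 1.0) * math.log(y) - y / scale
                    - math.lgamma(shape) - shape * math.log(scale))


def gamma_cdf_numeric(upper, shape, scale, n=20000):
    """Midpoint-rule approximation of the gamma distribution function."""
    h = upper / n
    return h * sum(gamma_pdf((i + 0.5) * h, shape, scale) for i in range(n))


# Hypothetical parameter values.
alpha, theta, xi = 2.5, 0.4, 0.7

lhs = gamma_cdf_numeric(xi, alpha + 1.0, theta)
rhs = (gamma_cdf_numeric(xi, alpha, theta)
       - (xi / theta) ** alpha * math.exp(-xi / theta) / math.gamma(alpha + 1.0))
```

Both sides are computed from separate numerical integrals, so their agreement is a genuine check of the identity rather than a restatement of it.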
S.4 Descriptive statistics for covariates

                    Mean     Standard Deviation
RDT                 0.65     0.30
Face Value (log)    3.84     0.54
Insured Fraction    0.06     0.03

                    Low / Small    Mid / Medium    High / Large
Experience          15.52          55.38           29.10
Size                85.06          9.95            4.98

                    Maintenance    Hybrid    Performance    Other
Type                88.05          6.85      4.42           0.67

Table 2: Descriptive statistics for covariates. For categorical variables, the
frequencies (in %) of the levels are given.
[Figure 4: histogram of LGD/FV (horizontal axis: LGD/FV) with fitted models;
legend: CG, Tobit, Skew t Tobit. Annotations at zero: 52.44% (52.36%) (53.86%)
(52.68%); at one: 15.38% (15.73%) (13.75%) (15.07%).]

Figure 4: Comparison of fitted censored gamma, normal Tobit, and skew t Tobit
models with no covariates. The numbers above the blue arrows represent the
percentage of LGD/FV's being exactly zero or one, respectively. In parentheses
are the corresponding numbers as predicted by the models.



S.5 Additional Plots Illustrating Other Fitted Models

Additionally, two different types of models have been fitted to the data. First,
the two-limit version of the normal Tobit model and its corresponding two-tiered
and zero-inflated extensions. Further, we fitted models using the skewed
t-distribution (Azzalini and Capitanio (2003)) where, in each model, the shifted
gamma distribution is replaced by a skewed t-distribution. The degrees of
freedom were chosen to be 1 since this provided the best fit in general.
[Figure 5: scatter plot "LGD/FV vs. Face Value" (horizontal axis: Face Value,
log scale). Legend: 90%, 80%, 70%, and 60% quantiles, each shown for the
non-parametric fit, the normal Tobit model, and the skew t Tobit model.]

Figure 5: Scatter plot of LGD/FV versus face value (on a logarithmic scale).
The jittered points in the bars below 0.0 and above 1.0 represent bonds with
LGD/FV being exactly zero and one, respectively. The colored solid lines are
non-parametrically fitted quantiles and the mean. The dashed and dotted lines
represent quantiles of the fitted normal Tobit model and the skew t Tobit model,
respectively. Logarithmic and squared logarithmic face value are taken as
covariates.



Entries are Coef (Std. Err.), followed by the significance code.

Covariate           Censored             Two-Tiered (β)       Two-Tiered (γ)       Zero-Inflated (β)    Zero-Inflated (γ)
Intercept           0.86 (0.13) ***      15. (0.030) ***      15. (0.030) ***      2.1 (0.19) ***       2.1 (0.19) ***
RDT, Lin            -0.19 (0.044) ***    1.5 (0.049) ***      0.41 (0.33)          -0.042 (0.071)       4.5 (1.1) ***
RDT, Quad           -0.27 (0.15) .       8.3 (0.087) ***      -0.70 (0.11) ***     0.33 (0.24)          0.59 (0.20)
Experience, Lin     -0.40 (0.028) ***    -2.0 (0.0041) ***    -1.6 (0.38) ***      -0.32 (0.043) ***    2.3 (0.65)
Experience, Quad    0.036 (0.020) .      0.29 (0.0091) ***    -0.87 (0.073) ***    0.013 (0.027)        0.36 (0.17) *
Size, Lin           0.40 (0.16) *        2.8 (0.051) ***      0.036 (0.053)        0.23 (0.21)          -0.094 (0.092)
Size, Quad          0.51 (0.10) ***      0.10 (0.0098) ***    0.91 (0.40) *        0.049 (0.13)         -0.27 (0.51)
Face Value, Lin     -0.22 (0.02)         -5.4 (0.0017)        1.4 (0.25)           -0.50 (0.042)        -1.5 (0.31)
Face Value, Quad    0.22 (0.025) ***     0.27 (0.0021) ***    -0.047 (0.068)       0.26 (0.034) ***     -1.5 (0.31) ***
Type, Hybrid        1.4 (0.47) **        8.9 (0.021) ***      0.64 (0.075) ***     1.0 (0.46) *         -0.78 (0.32) *
Type, Performance   -0.036 (0.052)       0.96 (1.7)           3.3 (1.5) *          0.0095 (0.061)       -3.5 (2.7)
Type, Other         -0.019 (0.071)       2.4 (0.031) ***      -0.16 (0.14)         0.035 (0.10)         0.19 (0.31)
Ins. Frac.          0.29 (0.19)          8.7 (0.99) ***       -0.28 (0.19)         0.76 (0.22) ***      0.26 (0.37)

                    Censored             Two-Tiered                                Zero-Inflated
                    Value (Std. Err.)    Value (Std. Err.)                         Value (Std. Err.)
log(σ)              -0.040 (0.016)       0.81 (0.0014)                             -0.15 (0.022)

Log-Likelihood      -8241.6              -7864.4                                   -8169
AIC                 16511.2              15780.8                                   16390

Table 3: Fitted censored, two-tiered, and zero-inflated normal Tobit models including all covariates. Codes for significance levels:
'***': p < 0.001, '**': 0.001 < p < 0.01, '*': 0.01 < p < 0.05, '.': 0.05 < p < 0.1.

Entries are Coef (Std. Err.), followed by the significance code.

Covariate           Censored             Two-Tiered (β)       Two-Tiered (γ)       Zero-Inflated (β)    Zero-Inflated (γ)
Intercept           -0.38 (0.056)        -1.9 (0.10)          0.015 (0.0090) .     -0.090 (0.075)       -0.49 (0.33)
RDT, Lin            -0.15 (0.023) ***    -0.15 (0.037) ***    -0.010 (0.0025) ***  -0.19 (0.041) ***    -0.17 (0.16)
RDT, Quad           -0.38 (0.073) ***    -0.36 (0.12) **      -0.026 (0.0080) **   -0.39 (0.14) **      0.23 (0.54)
Experience, Lin     -0.12 (0.013) ***    0.20 (0.034) ***     -0.016 (0.0037) ***  0.050 (0.026) .      1.3 (0.21) ***
Experience, Quad    -0.013 (0.0094)      -0.060 (0.022) **    0.0020 (0.0011) .    -0.0017 (0.023)      -0.28 (0.15)
Size, Lin           0.26 (0.10) **       -0.20 (0.13)         0.012 (0.0060) .     0.42 (0.15) **       0.48 (0.41)
Size, Quad          0.33 (0.069) ***     0.0064 (0.073)       0.021 (0.0048) ***   0.33 (0.10) **       -0.11 (0.30)
Face Value, Lin     0.038 (0.010) ***    0.40 (0.019) ***     0.0024 (0.00079) **  0.030 (0.0074) ***   0.040 (0.06)
Face Value, Quad    0.042 (0.0064) ***   -0.12 (0.011) ***    0.011 (0.0028) ***   0.0069 (0.016)       -0.44 (0.08) ***
Type, Hybrid        0.56 (0.11) ***      -2.1 (0.48) ***      0.066 (0.033) *      0.020 (0.22)         -1.8 (1.1)
Type, Performance   -0.038 (0.021) .     -0.0033 (0.026)      -0.0038 (0.0024)     -0.034 (0.037)       -0.038 (0.19)
Type, Other         -0.12 (0.0099) ***   -0.050 (0.031)       -0.0070 (0.0033) *   -0.00075 (0.062)     0.25 (0.21)
Ins. Frac.          -0.13 (0.085)        -1.1 (0.64) .        -0.013 (0.0098)      0.50 (0.084) ***     1.1 (0.34) **

Skew t Par.         Censored             Two-Tiered                                Zero-Inflated
                    Value (Std. Err.)    Value (Std. Err.)                         Value (Std. Err.)
df                  1                    1                                         1
log(σ)              -1.1 (0.028)         -3.5 (0.22)                               -0.90 (0.033)
α                   30. (23.)            -1.0 (0.31)                               38. (27.)

Log-Likelihood      -8019                -7692.4                                   -7964.4
AIC                 16067.9              15440.7                                   15984.8

Table 4: Fitted censored, two-tiered, and zero-inflated skew t (df = 1) Tobit models including all covariates. Codes for significance
levels: '***': p < 0.001, '**': 0.001 < p < 0.01, '*': 0.01 < p < 0.05, '.': 0.05 < p < 0.1.


